perl alarm with subprocess - windows

I have a perl script that runs a series of batch scripts for regression testing. I want to implement a timeout on the batch scripts. I currently have the following code.
my $pid = open CMD, "$cmd 2>&1 |";
eval {
# setup the alarm
local $SIG{ALRM} = sub { die "alarm\n" };
# alarm on the timeout
alarm $MAX_TIMEOUT;
log_output("setting alarm to $MAX_TIMEOUT\n");
# run our exe
while( <CMD> ) {
$$out_ref .= $_;
}
$timeRemaining = alarm 0;
};
if ($@) {
#catch the alarm, kill the executable
}
The problem is that no matter what I set the max timeout to, the alarm is never tripped. I've tried using Perl::Unsafe::Signals but that did not help.
Is this the best way to execute the batch scripts if I want to be able to capture their output? Is there another way that would do the same thing that would allow me to use alarms, or is there another method besides alarms to timeout the program?
I have built a test script to confirm that alarm works with my perl and windows version, but it does not work when I run a command like this.
I'm running this with activeperl 5.10.1 on windows 7 x64.

It's hard to tell when alarm will work, when a system call will and won't get interrupted by a SIGALRM, how the same code might behave differently on different operating systems, etc.
If your job times out, you want to kill the subprocess you have started. This is a good use case for the poor man's alarm:
my $pid = open CMD, "$cmd 2>&1 |";
my $time = $MAX_TIMEOUT;
my $poor_mans_alarm = "sleep 1,kill(0,$pid)||exit for 1..$time;kill -9,$pid";
if (fork() == 0) {
exec($^X, "-e", $poor_mans_alarm);
die "Poor man's alarm failed to start"; # shouldn't get here
}
# on Windows, instead of fork+exec, you can say
# system 1, qq[$^X -e "$poor_mans_alarm"]
...
The poor man's alarm runs in a separate process. Every second, it checks whether the process with identifier $pid is still alive. If the process isn't alive, the alarm process exits. If the process is still alive after $time seconds, it sends a kill signal to the process (I used 9 to make it untrappable and -9 to take out the whole subprocess tree, your needs may vary).
(The exec actually may not be necessary. I use it because I also use this idiom to monitor processes that might outlive the Perl script that launched them. Since that wouldn't be the case with this problem, you could skip the exec call and say
if (fork() == 0) {
for (1..$time) { sleep 1; kill(0,$pid) || exit }
kill -9, $pid;
exit;
}
instead.)

Related

Perl: Child subprocesses are not being killed when child is being killed

This is being done on windows
I am getting the error: The process cannot access the file because it is being used by another process. It seems that even after the child exits (exit 0) and the parent waits for the child to complete (waitpid($lkpid, 0)), the child's subprocesses are not killed. Hence, when the next iteration (test case) runs, it finds the process already running and gives the error message.
Code Snippet ($bashexe and $bePath are defined):
my $MSROO = "/home/abc";
if (my $fpid = fork()) {
for (my $i=1; $i<=1200; $i++) {
sleep 1;
if (-e "$MSROO/logs/Complete") {
last;
}
}
}
elsif (defined ($fpid)) {
&runAndMonitor (\@ForRun, "$MSROO/logs/Test.log"); ### @ForRun has the list of test cases
system("touch $MSROO/logs/Complete");
exit 0;
}
sub runAndMonitor {
my @ForRunPerProduct = @{$_[0]};
my $logFile = $_[1];
foreach my $TestVar (@ForRunPerProduct) {
my $TestVarDirName = $TestVar;
$TestVarDirName = dirname ($TestVarDirName);
my $lkpid;
my $filehandle;
if ( !($pid = open( $filehandle, "-|" , " $bashexe -c \" echo abc \; perl.exe reg_script.pl $TestVarDirName -t wint\" >> $logFile "))) {
die( "Failed to start process: $!" );
}
else {
print "$pid is pid of shell running: $TestVar\n"; ### Issue (error message above) is coming here after piped open is launched for a new test
my $taskInfo=`tasklist | grep "$pid"`;
chomp ($taskInfo);
print "$taskInfo is taskInfo\n";
}
if ($lkpid = fork()) {
sleep 1;
chomp ($lkpid);
LabelToCheck:
my $pidExistingOrNotInParent = kill 0, $pid;
if ($pidExistingOrNotInParent) {
sleep 10;
goto LabelToCheck;
}
}
elsif (defined ($lkpid)) {
sleep 12;
my $pidExistingOrNot = kill 0, $pid;
if ($pidExistingOrNot){
print "$pid still exists\n";
my $taskInfoVar1 =`tasklist | grep "$pid"`;
chomp ($taskInfoVar1);
my $killPID = kill 15, $pid;
print "$killPID is the value of PID\n"; ### Here, I am getting output 1 (value of $killPID). Also, I tried with signal 9, and seeing same behavior
my $taskInfoVar2 =`tasklist | grep "$pid"`;
sleep 10;
exit 0;
}
}
system("TASKKILL /F /T /PID $lkpid") if ($lkpid); ### Here, child pid is not being killed . Saying "ERROR: The process "-1472" not found"
sleep 2;
print "$lkpid is lkpid\n"; ## Here, though I am getting message "-1472 is lkpid"
#waitpid($lkpid, 0);
return;
}
Why is it that even after "exit 0 in child" and then "waitpid in parent", child subprocesses are not being killed? What can be done to fully clean child process and its subprocesses?
The exit doesn't touch child processes; it's not meant to. It just exits the process. In order to shut down its child processes as well you'd need to signal them.†
However, since this is Windows, where fork is merely emulated, here is what perlfork says
Behavior of other Perl features in forked pseudo-processes
...
kill() "kill('KILL', ...)" can be used to terminate a pseudo-process by passing it the ID returned by fork(). The outcome of kill on a
pseudo-process is unpredictable and it should not be used except under dire circumstances, because the operating system may not
guarantee integrity of the process resources when a running thread is terminated
...
exit() exit() always exits just the executing pseudo-process, after automatically wait()-ing for any outstanding child pseudo-processes. Note
that this means that the process as a whole will not exit unless all running pseudo-processes have exited. See below for some
limitations with open filehandles.
So don't do kill, while exit behaves nearly opposite to what you need.
But the Windows command TASKKILL can terminate a process and its tree
system("TASKKILL /F /T /PID $pid");
This should terminate a process with $pid and its children processes. (The command can use a process's name instead, TASKKILL /F /T /IM $name, but using names on a busy modern system, with a lot going on, can be tricky.) See taskkill on MS docs.
A more reliable way about this, altogether, is probably to use dedicated modules for Windows process management.
A few other comments
I also notice that you use pipe-open, while perlfork says for that
Forking pipe open() not yet implemented
The open(FOO, "|-") and open(BAR, "-|") constructs are not yet implemented.
So I am confused, does that pipe-open work in your code? But perlfork continues with
This limitation can be easily worked around in new code by creating a pipe explicitly. The following example shows how to write to a forked child: [full code follows]
That C-style loop, for (my $i=1; $i<=1200; $i++), is better written as
for my $i (1..1200) { ... }
(or foreach; the two are synonyms). A C-style loop is very rarely needed in Perl.
† A kill with a negative signal (name or number) or a negative process ID generally signals the whole process group of the target rather than a single process. This is on Linux.
So one way would be to signal that child from its parent when ready, instead of exit-ing from it. (Then the child would have to signal the parent in some way when it's ready.)
Or, the child can send a negative terminate signal to all its direct children process, then exit.
You didn't say which perl you are using. On Windows with Strawberry Perl (and presumably ActiveState), fork() emulation is ... very problematic (maybe just "broken"), as @zdim mentioned. If you want a longer explanation, see Proc::Background::Win32 - Perl Fork Limitations
Meanwhile, if you use Cygwin's Perl, fork works perfectly. This is because Cygwin does a full emulation of Unix fork() semantics, so anything built against cygwin works just like it does on Unix. The downside is that file paths show up weird, like /cygdrive/c/Program Files. This may or may not trip up code you've already written.
But, you might also have confusion about process trees. Even on Unix, killing a parent process does not kill the child processes. This usually happens for various reasons, but it is not enforced. For example, most child processes have a pipe open to the parent, and when the parent exits that pipe closes and then reading/writing the pipe gives SIGPIPE that kills the child. In other cases, the parent catches SIGTERM and then re-broadcasts that to its children before exiting gracefully. In other cases, monitors like Systemd or Docker create a container inherited by all children of the main process, and when the main process exits the monitor kills off everything else in the container.
Since it looks like you're writing your own task monitor, I'll give some advice from one that I wrote for Windows (and is running along happily years later). I ended up with a design using Proc::Background where the parent starts a task that writes to a file as STDOUT/STDERR. Then it opens that same log file and wakes up every few seconds to try reading more of the log file to see what the task is doing, and check with the Proc::Background object to see if the task exited. When the task exits, it appends the exit code and timestamp to the log file. The monitor has a timeout setting that if the child exceeds, it just un-gracefully runs TerminateProcess. (you could improve on that by leaving STDIN open as a pipe between monitor and worker, and then have the worker check STDIN every now and then, but on Windows that will block, so you have to use PeekNamedPipe, which gets messy)
Meanwhile, the monitor parses any new lines of the log file to read status information and send updates to the database. The other parts of the system can watch the database to see the status of background tasks, including a web admin interface that can also open and read the log file. If the monitor sees that a child has run for too long, it can use TerminateProcess to stop it. Missing from this design is any way for the monitor to know when it's being asked to exit, and clean up, which is a notable deficiency, and one you're probably looking for. However, there actually isn't any way to intercept a TerminateProcess aimed at the parent! Windows does have some Message Queue API stuff where you can set up to receive notifications about termination, but I never chased down the full details there. If you do, please come back and drop a comment for me :-)

How to check infinite loop in bash?

I have been trying to run an executable from a bash script multiple times. There is a chance that this executable will fall into an infinite loop or segfault. I know there is no try-catch in bash, but we can emulate it using:
{ #try
"myCommand" && "do what i want"
} || { #except
"handle error"
}
But this cannot detect an infinite loop. How can I handle this problem?
You can use timeout from GNU coreutils.
Here is an example with a timeout of 10 seconds:
timeout 10s yourscript.sh
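To act on the outcome, the script can inspect the exit status afterwards. The following is a minimal sketch, assuming GNU timeout (which exits with status 124 when the limit is hit); classify_run is a hypothetical helper name and the commands it runs here are stand-ins for the real executable:

```bash
#!/bin/bash
# classify_run LIMIT CMD [ARGS...] (hypothetical helper): run CMD under
# GNU timeout and print a one-line verdict based on the exit status.
classify_run() {
    local limit="$1"; shift
    local status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 0 ]; then
        echo "ok"
    elif [ "$status" -eq 124 ]; then
        echo "timed out"                            # GNU timeout's status for a hit limit
    elif [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"   # e.g. 139 = 128 + 11 (SIGSEGV on Linux)
    else
        echo "failed with status $status"
    fi
}

result_ok=$(classify_run 5 true)
result_slow=$(classify_run 1 sleep 3)   # stand-in for a command stuck in a loop
result_fail=$(classify_run 5 false)
echo "$result_ok / $result_slow / $result_fail"
```

The three branches map onto the cases in the question: a clean run, a hang, and a crash or ordinary failure.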
Bash can't tell you what's going on inside myCommand unless that loop sends a signal or modifies the system/environment. You could run your #try in the background &, then do something if it's still running after a certain amount of time. $! refers to the last backgrounded task.
Check out Job Control in Bash.
myCommand && doWhatIWant &
sleep 10
ps $! &>/dev/null && kill $!
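One drawback of the snippet above is that it always sleeps for the full 10 seconds, even when the command finishes early. A possible variation, sketched here under the assumption that a plain SIGTERM is enough to stop the command, runs the sleep in a separate watchdog and waits on the command itself; run_limited is a hypothetical name:

```bash
#!/bin/bash
# run_limited SECONDS CMD [ARGS...] (hypothetical helper): run CMD in the
# background and let a watchdog send SIGTERM if it outlives the limit.
run_limited() {
    local limit="$1"; shift
    "$@" &
    local cmd_pid=$!
    # Watchdog: its output is discarded so it never holds our stdout open.
    ( sleep "$limit"; kill "$cmd_pid" 2>/dev/null ) >/dev/null 2>&1 &
    local watchdog_pid=$!
    local status=0
    wait "$cmd_pid" || status=$?          # returns as soon as CMD ends
    kill "$watchdog_pid" 2>/dev/null      # CMD is done; the watchdog is no longer needed
    wait "$watchdog_pid" 2>/dev/null
    return "$status"
}

slow_status=0
run_limited 1 sleep 5 || slow_status=$?   # killed by the watchdog
fast_status=0
run_limited 5 true || fast_status=$?      # finishes well before the limit
echo "slow: $slow_status, fast: $fast_status"
```

A status of 143 (128 + 15) means the command was ended by SIGTERM, which here indicates the watchdog fired.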

Perl signal handlers and Windows

I am on Windows with Strawberry perl. I have some GUI.pl application which run script.pl which run some.exe. The perl script works as a proxy for STDIN/OUT/ERR between GUI application and some.exe.
The problem is that I can't kill some.exe process in chain GUI.pl -> script.pl -> some.exe.
GUI.pl sends TERM to script.pl
# GUI.pl
my $pid = open my $cmd, '-|', 'script.pl';
sleep 1;
kill 'TERM', $pid;
script.pl catches 'TERM' and tries to kill some.exe
# script.pl
$SIG{TERM} = \&handler;
my $pid = open my $cmd, '-|', 'some.exe';
sub handler {
kill 'TERM', $pid;
}
With this scheme, the process of some.exe continues to be executed. I've already learned a lot about the signals but still do not understand how to resolve this problem.
Thanks in advance.
One solution is to use threads:
# script.pl
use threads;
use threads::shared;
$SIG{BREAK} = \&handler;
my $pid :shared;
async {
$pid = open my $cmd, '-|', 'some.exe'
}->detach;
# 1 second for blocking opcode. After sleep handler will be applied
sleep 1;
sub handler {
kill 'TERM', $pid;
}
I would be wary of use of 'kill' signals on Windows, as they're a POSIX thing. http://perldoc.perl.org/functions/kill.html
But I think the problem here will probably be because of Deferred Signals. Specifically if you send a signal to a process, the interpreter will wait until it's "safe" to process it. In the middle of "some.exe" is unlikely to be.
Using kill signals in this way isn't a particularly good form of IPC. See perlmonks: Signals Vs. Windows for some useful discussion.
Signals on Windows are very idiosyncratic. You may have better luck with the INT or QUIT signals than TERM. My extensive research into how Perl and Windows handle signals is summarized here.
TL;DR: On Windows, TERM can terminate a process, but it cannot be handled. INT and QUIT can be handled, and their default behavior is to terminate the process. If you use Windows pseudo-processes (which is what you get if you call fork in Windows), then things quickly get more complicated.

Application process never terminates on each run

I am seeing that an application always remains alive even after I close it using my Perl script below. Also, on subsequent runs, it always says "The process cannot access the file because it is being used by another process. iperf.exe -u -s -p 5001 successful. Output was:"
So every time I have to change the file name $file used in script or I have to kill the iperf.exe process in the Task Manager.
Could anybody please let me know the way to get rid of it?
Here is the code I am using ...
my @command_output;
eval {
my $file = "abc6.txt";
$command = "iperf.exe -u -s -p 5001";
alarm 10;
system("$command > $file");
alarm 0;
close $file;
};
if ($@) {
warn "$command timed out.\n";
} else {
print "$command successful. Output was:\n", $file;
}
unlink $file;
Since your process didn't open $file, the close $file achieves nothing.
If the process completed in time, you would not have the problem. Therefore, you need to review why you think iperf can do its job in 10 seconds and why it thinks it can't.
Further, if the timeout occurs, you should probably aim to terminate the child process. On Unix, you might send it SIGTERM, SIGHUP and SIGKILL signals in sequence, with a short pause (1 second each, perhaps) between. The first two are polite requests to get the hell out of Dodge City; the last is the ultimate death threat. Of course, you have to know which process to send the signal to - that may be trickier to determine with 'system' and Windows than on Unix.

Why can't I use job control in a bash script?

In this answer to another question, I was told that
in scripts you don't have job control
(and trying to turn it on is stupid)
This is the first time I've heard this, and I've pored over the bash.info section on Job Control (chapter 7), finding no mention of either of these assertions. [Update: The man page is a little better, mentioning 'typical' use, default settings, and terminal I/O, but no real reason why job control is particularly ill-advised for scripts.]
So why doesn't script-based job-control work, and what makes it a bad practice (aka 'stupid')?
Edit: The script in question starts a background process, starts a second background process, then attempts to put the first process back into the foreground so that it has normal terminal I/O (as if run directly), which can then be redirected from outside the script. Can't do that to a background process.
As noted by the accepted answer to the other question, there exist other scripts that solve that particular problem without attempting job control. Fine. And the lambasted script uses a hard-coded job number — Obviously bad. But I'm trying to understand whether job control is a fundamentally doomed approach. It still seems like maybe it could work...
What he meant is that job control is by default turned off in non-interactive mode (i.e. in a script.)
From the bash man page:
JOB CONTROL
Job control refers to the ability to selectively stop (suspend)
the execution of processes and continue (resume) their execution at a
later point.
A user typically employs this facility via an interactive interface
supplied jointly by the system’s terminal driver and bash.
and
set [--abefhkmnptuvxBCHP] [-o option] [arg ...]
...
-m Monitor mode. Job control is enabled. This option is on by
default for interactive shells on systems that support it (see
JOB CONTROL above). Background processes run in a separate
process group and a line containing their exit status is
printed upon their completion.
When he said "is stupid" he meant that not only:
is job control meant mostly for facilitating interactive control (whereas a script can work directly with the pid's), but also
I quote his original answer, ... relies on the fact that you didn't start any other jobs previously in the script which is a bad assumption to make. Which is quite correct.
UPDATE
In answer to your comment: yes, nobody will stop you from using job control in your bash script -- there is no hard case for forcefully disabling set -m (i.e. yes, job control from the script will work if you want it to.) Remember that in the end, especially in scripting, there is always more than one way to skin a cat, but some ways are more portable, more reliable, make it simpler to handle error cases, parse the output, etc.
Your particular circumstances may or may not warrant a way different from what lhunath (and other users) deem "best practices".
Job control with bg and fg is useful only in interactive shells. But & in conjunction with wait is useful in scripts too.
On multiprocessor systems, spawning background jobs can greatly improve the script's performance, e.g. in build scripts where you want to start at least one compiler per CPU, or when processing images in parallel using ImageMagick tools.
The following example runs up to 8 parallel gcc's to compile all source files in an array:
#!bash
...
for ((i = 0, end=${#sourcefiles[@]}; i < end;)); do
for ((cpu_num = 0; cpu_num < 8; cpu_num++, i++)); do
if ((i < end)); then gcc ${sourcefiles[$i]} & fi
done
wait
done
There is nothing "stupid" about this. But you'll require the wait command, which waits for all background jobs before the script continues. The PID of the last background job is stored in the $! variable, so you may also wait ${!}. Note also the nice command.
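A bare wait discards the exit statuses of the individual jobs. If the script needs to know whether any of them failed, it can record each $! and wait on the PIDs one by one. A minimal sketch, in which the subshells are stand-ins for real work such as the gcc calls above:

```bash
#!/bin/bash
# Start three stand-in jobs in the background and remember each PID.
pids=""
for n in 1 2 3; do
    ( exit "$((n % 2))" ) &    # stand-in job: odd-numbered jobs fail
    pids="$pids $!"
done

# wait PID returns that job's exit status, so failures can be counted.
failures=0
for pid in $pids; do
    wait "$pid" || failures=$((failures + 1))
done
echo "$failures job(s) failed"   # prints "2 job(s) failed"
```

The PIDs are kept in a plain space-separated string so the sketch also runs under a POSIX sh; in bash an array would work equally well.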
Sometimes such code is useful in makefiles:
buildall:
for cpp_file in *.cpp; do gcc -c $$cpp_file & done; wait
This gives much finer control than make -j.
Note that & is a line terminator like ; (write command& not command&;).
Hope this helps.
Job control is useful only when you are running an interactive shell, i.e., you know that stdin and stdout are connected to a terminal device (/dev/pts/* on Linux). Then, it makes sense to have something on foreground, something else on background, etc.
Scripts, on the other hand, don't have such a guarantee. Scripts can be made executable, and run without any terminal attached. It doesn't make sense to have foreground or background processes in this case.
You can, however, run other commands non-interactively in the background (appending "&" to the command line) and capture their PIDs with $!. Then you use kill to kill or suspend them (simulating Ctrl-C or Ctrl-Z on the terminal, if the shell were interactive). You can also use wait (instead of fg) to wait for the background process to finish.
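The sequence just described might look like the following sketch, where sleep 30 is only a stand-in for a long-running command:

```bash
#!/bin/bash
sleep 30 &                 # stand-in for a long-running background command
pid=$!                     # PID of the most recent background command

kill -STOP "$pid"          # suspend it, roughly what Ctrl-Z does interactively
kill -CONT "$pid"          # let it run again
kill -TERM "$pid"          # ask it to terminate

status=0
wait "$pid" || status=$?   # wait plays the role of fg and yields the exit status
echo "exit status: $status"
```

An exit status of 143 (128 + 15) reported by wait indicates that the process ended because of SIGTERM.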
It could be useful to turn on job control in a script to set traps on
SIGCHLD. The JOB CONTROL section in the manual says:
The shell learns immediately whenever a job changes state. Normally,
bash waits until it is about to print a prompt before reporting
changes in a job's status so as to not interrupt any other output. If
the -b option to the set builtin command is enabled, bash reports
such changes immediately. Any trap on SIGCHLD is executed for each
child that exits.
(emphasis is mine)
Take the following script, as an example:
dualbus@debian:~$ cat children.bash
#!/bin/bash
set -m
count=0 limit=3
trap 'counter && { job & }' CHLD
job() {
local amount=$((RANDOM % 8))
echo "sleeping $amount seconds"
sleep "$amount"
}
counter() {
((count++ < limit))
}
counter && { job & }
wait
dualbus@debian:~$ chmod +x children.bash
dualbus@debian:~$ ./children.bash
sleeping 6 seconds
sleeping 0 seconds
sleeping 7 seconds
Note: CHLD trapping seems to be broken as of bash 4.3
In bash 4.3, you could use 'wait -n' to achieve the same thing,
though:
dualbus@debian:~$ cat waitn.bash
#!/home/dualbus/local/bin/bash
count=0 limit=3
trap 'kill "$pid"; exit' INT
job() {
local amount=$((RANDOM % 8))
echo "sleeping $amount seconds"
sleep "$amount"
}
for ((i=0; i<limit; i++)); do
((i>0)) && wait -n; job & pid=$!
done
dualbus@debian:~$ chmod +x waitn.bash
dualbus@debian:~$ ./waitn.bash
sleeping 3 seconds
sleeping 0 seconds
sleeping 5 seconds
You could argue that there are other ways to do this in a more
portable way, that is, without CHLD or wait -n:
dualbus@debian:~$ cat portable.sh
#!/bin/sh
count=0 limit=3
trap 'counter && { brand; job & }; wait' USR1
unset RANDOM; rseed=123459876$$
brand() {
[ "$rseed" -eq 0 ] && rseed=123459876
h=$((rseed / 127773))
l=$((rseed % 127773))
rseed=$((16807 * l - 2836 * h))
RANDOM=$((rseed & 32767))
}
job() {
amount=$((RANDOM % 8))
echo "sleeping $amount seconds"
sleep "$amount"
kill -USR1 "$$"
}
counter() {
[ "$count" -lt "$limit" ]; ret=$?
count=$((count+1))
return "$ret"
}
counter && { brand; job & }
wait
dualbus@debian:~$ chmod +x portable.sh
dualbus@debian:~$ ./portable.sh
sleeping 2 seconds
sleeping 5 seconds
sleeping 6 seconds
So, in conclusion, set -m is not that useful in scripts, since
the only interesting feature it brings to scripts is being able to
work with SIGCHLD. And there are other ways to achieve the same thing
either shorter (wait -n) or more portable (sending signals yourself).
Bash does support job control, as you say. In shell script writing, there is often an assumption that you can't rely on the fact that you have bash, but that you have the vanilla Bourne shell (sh), which historically did not have job control.
I'm hard-pressed these days to imagine a system in which you are honestly restricted to the real Bourne shell. Most systems' /bin/sh will be linked to bash. Still, it's possible. One thing you can do is instead of specifying
#!/bin/sh
You can do:
#!/bin/bash
That, and your documentation, would make it clear your script needs bash.
Possibly o/t, but I quite often use nohup when I ssh into a server for a long-running job so that if I get logged out the job still completes.
I wonder if people are confusing stopping and starting from a master interactive shell and spawning background processes? The wait command allows you to spawn a lot of things and then wait for them all to complete, and like I said I use nohup all the time. It's more complex than this and very underused - sh supports this mode too. Have a look at the manual.
You've also got
kill -STOP pid
I quite often do that if I want to suspend the currently running sudo, as in:
kill -STOP $$
But woe betide you if you've jumped out to the shell from an editor - it will all just sit there.
I tend to use mnemonic -KILL etc. because there's a danger of typing
kill - 9 pid # note the space
and in the old days you could sometimes bring the machine down because it would kill init!
jobs DO work in bash scripts
BUT, you ... NEED to watch for the spawned stuff
like:
ls -1 /usr/share/doc/ | while read -r doc ; do ... done
jobs will have different context on each side of the |
bypassing this may be using for instead of while:
for doc in `ls -1 /usr/share/doc` ; do ... done
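The reason is that bash, with default settings, runs each side of a pipe in its own subshell, so anything the while loop changes or starts (variables, background jobs) stays in that subshell. A small sketch of the effect, using a counter variable as a stand-in for the job table:

```bash
#!/bin/bash
# Piped loop: the body runs in a subshell, so the parent's counter is untouched.
count_piped=0
printf 'a\nb\nc\n' | while read -r line; do
    count_piped=$((count_piped + 1))
done

# Redirected loop: the body runs in the current shell, so the change persists.
count_direct=0
while read -r line; do
    count_direct=$((count_direct + 1))
done <<EOF
a
b
c
EOF

echo "piped: $count_piped, direct: $count_direct"   # prints "piped: 0, direct: 3"
```

Feeding the loop through a redirection instead of a pipe keeps it in the current shell, which is also why the for form above sees the jobs it starts.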
this should demonstrate how to use jobs in a script ...
with the mention that my commented note is ... REAL (dunno why that behaviour)
#!/bin/bash
for i in `seq 7` ; do ( sleep 100 ) & done
jobs
while [ `jobs | wc -l` -ne 0 ] ; do
for jobnr in `jobs | awk '{print $1}' | cut -d\[ -f2- |cut -d\] -f1` ; do
kill %$jobnr
done
#this is REALLY ODD ... but while won't exit without this ... dunno why
jobs >/dev/null 2>/dev/null
done
sleep 1
jobs

Resources