How do I do a non-blocking read from a pipe in Perl? - windows

I have a program which calls another program and processes the child's output, i.e.:
my $pid = open($handle, "$commandPath $options |");
Now I've tried a couple of different ways to read from the handle without blocking, with little or no success.
I found related questions:
perl-win32-how-to-do-a-non-blocking-read-of-a-filehandle-from-another-process
why-does-my-perl-sysread-block-when-reading-from-a-socket
But the approaches there suffer from these problems:
ioctl consistently crashes perl
sysread blocks on 0 bytes (a common occurrence)
I'm not sure how to go about solving this problem.

Pipes are not as functional on Windows as they are on Unix-y systems. You can't use the 4-argument select on them, and their default capacity is minuscule.
You are better off trying a socket or file based workaround.
$pid = fork();
if (defined($pid) && $pid == 0) {
    # child: run the command with its output redirected to a temporary file
    exit system("$commandPath $options > $someTemporaryFile");
}
open($handle, "<$someTemporaryFile");
Now you have a couple more cans of worms to deal with: running waitpid periodically to check when the background process has stopped creating output, calling seek $handle,0,1 to clear the eof condition after you read from $handle, and cleaning up the temporary file. But it works; a rough polling loop is sketched below.
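A rough sketch of that polling loop, reusing $pid, $handle, and $someTemporaryFile from the snippet above (untested on Windows; adjust to taste):
use POSIX ":sys_wait_h";    # for WNOHANG

while (1) {
    # has the background process exited yet?
    my $finished = (waitpid($pid, WNOHANG) == $pid);

    while (my $line = <$handle>) {
        # ... process $line ...
    }
    last if $finished;      # child was already gone before this drain

    seek $handle, 0, 1;     # clear the eof condition so new output is seen
    sleep 1;                # don't burn CPU while waiting for output
}
close $handle;
unlink $someTemporaryFile;  # clean up the temporary file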
I have written the Forks::Super module to deal with issues like this (and many others). For this problem you would use it like this:
use Forks::Super;
my $pid = fork { cmd => "$commandPath $options", child_fh => "out" };
my $job = Forks::Super::Job::get($pid);
while (!$job->is_complete) {
    my @someInputToProcess = $job->read_stdout();
    # ... process the input ...
    # ... optionally sleep here so you don't consume CPU waiting for input ...
}
waitpid $pid, 0;
my @theLastInputToProcess = $job->read_stdout();

Related

Can you compare values across probes in a multi-CPU safe way in DTrace?

I'm trying to write a DTrace script which does the following:
Whenever a new thread is started, increment a count.
Whenever one of these threads exits, decrement the count, and exit the script if the count is now zero.
I have something like this:
BEGIN {
    threads_alive = 0;
}

proc:::lwp-start /execname == $$1/ {
    self->started = timestamp;
    threads_alive += 1;
}

proc:::lwp-exit /self->started/ {
    threads_alive -= 1;
    if (threads_alive == 0) {
        exit(0);
    }
}
However, this doesn't work, because threads_alive is a scalar variable and thus it is not multi-cpu safe. As a result, multiple threads will overwrite each other's changes to the variable.
I have also tried using an aggregate variable instead:
@thread_count = sum(1);
/* or */
@threads_entered = count();
@threads_exitted = count();
Unfortunately, I haven't found syntax that lets me do something like @thread_count == 0 or @threads_started == @threads_stopped.
DTrace doesn't have facilities for doing the kind of thread-safe data sharing you're proposing, but you have a few options depending on precisely what you're trying to do.
If the executable name is unique, you can use the proc:::start and proc:::exit probes for the start of the first thread and the exit of the last thread respectively:
proc:::start
/execname == $$1/
{
    my_pid = pid;
}

proc:::exit
/pid == my_pid/
{
    exit(0);
}
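For reference, a script made of those two clauses would typically be run with the program name as its first macro argument, which is what $$1 expands to (the file name here is hypothetical):
dtrace -s start_exit.d myprogram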
If you're using the -c option to dtrace, the BEGIN probe fires very shortly after the corresponding proc:::start. Internally, dtrace -c forks the specified command and then starts tracing at one of four points: exec (before the first instruction of the new program), preinit (after ld has loaded all libraries), postinit (after each library's _init has run), or main (right before the first instruction of the program's main function, though this one is not supported on macOS).
If you use dtrace -x evaltime=exec -c <program> BEGIN will fire right before the first instruction of the program executes:
# dtrace -xevaltime=exec -c /usr/bin/true -n 'BEGIN{ts = timestamp}' -n 'pid$target:::entry{printf("%dus", (timestamp - ts)/1000); exit(0); }'
dtrace: description 'BEGIN' matched 1 probe
dtrace: description 'pid$target:::entry' matched 1767 probes
dtrace: pid 1848 has exited
CPU ID FUNCTION:NAME
10 16757 _dyld_start:entry 285us
The 285us is due to the time it takes dtrace to resume the process via /proc or ptrace(2) on macOS. Rather than proc:::start or proc:::lwp-start you may be able to use BEGIN, pid$target::_dyld_start:entry, or pid$target::main:entry.
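For instance, if you launch the program with -c, the $target macro variable lets you scope the proc and pid probes to just that process. A hedged sketch (untested; myprogram and the probe choices are only illustrative):
dtrace -x evaltime=exec -c ./myprogram -n '
    pid$target::main:entry            { trace("main reached"); }
    proc:::lwp-start /pid == $target/ { trace("thread started"); }
    proc:::exit      /pid == $target/ { exit(0); }
'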

Using IO::Select on STDIN on Windows

When I run the code below on a Linux system, as expected it outputs "Nothing is ready" about every two seconds, and it also echoes anything entered on the console.
But on Windows, can_read returns instantly with zero items.
use IO::Select;
my $sel = IO::Select->new();
$sel->add(\*STDIN);
while (1) {
    my @ready = $sel->can_read(2);
    if (scalar @ready == 0) {
        print "Nothing is ready\n";
    }
    foreach my $fh (@ready) {
        if ($fh eq \*STDIN) {
            my $in = <STDIN>;
            print "got $in from stdin\n";
        }
    }
}
It seems that select works only on Windows sockets and not on STDIN. How can I use IO::Select on STDIN on a Windows system?
You cannot; perldoc perlport states:
select Only implemented on sockets. (Win32, VMS)
This is because Windows itself implements select() only for sockets; see https://learn.microsoft.com/de-de/windows/desktop/api/winsock2/nf-winsock2-select.
The Windows equivalent seems to be I/O Completion Ports. But you have to find a way to use them from Perl.
If you really just care about STDIN, you can poll in a loop with Term::ReadKey, reading with ReadKey(-1) (non-blocking); a rough sketch follows. As the name of the module suggests, this may only work on a tty.
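A rough sketch of that polling approach (untested on Windows; what you do with each key is up to you):
use Term::ReadKey;

ReadMode 'cbreak';            # deliver single keys without waiting for Enter
END { ReadMode 'restore' }    # put the terminal back however we leave

while (1) {
    while (defined(my $key = ReadKey(-1))) {   # -1 = non-blocking read
        print "got key: $key\n";
    }
    # ... do other work, or sleep briefly ...
    sleep 1;
}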

Running commands in background from TCL script and formatting output

I have a Tcl script which runs multiple shell commands serially.
Something like this:
abc.tcl
command 1
command 2
command 3
...
command n
This script prints the outputs of these commands into a text file in the following format:
### ### ### ### ### ###
Command name
### ### ### ### ### ###
Command Output
### ### ### ### ### ##
I was trying to get the script to run faster by making the shell commands run in parallel instead of serially, pushing them into the background (command &). But I'm at a loss as to how to retain the formatting of my output text file as it was before.
When I push the commands into the background I'm forced to append their outputs to temporary files, but those files just have the outputs of the commands dumped together; it's difficult to tell the different outputs apart.
Is there some way I can redirect the output of each background command to an individual temp file (perhaps named after the process ID of the background process), and once all commands have run, cat the outputs together into the proper format? Any ideas/suggestions on how I can accomplish this?
If the commands don't have state that depends on each other, you can parallelize them. There are many ways to do this, but one of the easiest is to use the thread package's thread pooling (which requires a threaded Tcl, the norm on many platforms nowadays):
package require Thread
set pool [tpool::create -maxworkers 4]

# The list of *scripts* to evaluate
set tasks {
    {command 1}
    {command 2}
    ...
    {command n}
}

# Post the work items (scripts to run)
foreach task $tasks {
    lappend jobs [tpool::post $pool $task]
}

# Wait for all the jobs to finish
for {set running $jobs} {[llength $running]} {} {
    tpool::wait $pool $running running
}

# Get the results; you might want a different way to print the results...
foreach task $tasks job $jobs {
    set jobResult [tpool::get $pool $job]
    puts "TASK: $task"
    puts "RESULT: $jobResult"
}
The main tweakable is the size of the thread pool, which defaults to a limit of 4. (Set it via the -maxworkers option to tpool::create which I've listed explicitly above.) The best value to choose depends on how many CPU cores you've got and how much CPU load each task generates on average; you'll need to measure and tune…
You can also use the -initcmd option to pre-load each worker thread in the pool with a script of your choice; that's a good place to put your package require calls (see the sketch below). The workers are all completely independent of each other and of the master thread; they do not share state. You'd get the same model if you ran each piece of code in a separate process (but then you'd end up writing more code to do the coordinating).
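For instance, a pool whose workers each pre-load a package might be created like this (the package name is purely illustrative; use whatever your tasks actually need):
package require Thread

# every worker evaluates the -initcmd script once, when it starts up
set pool [tpool::create -maxworkers 8 -initcmd {
    package require http
}]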
[EDIT]: Here's a version that will work with Tcl 8.4 and which uses subprocesses instead.
namespace eval background {}

proc background::task {script callback} {
    set f [open |[list [info nameofexecutable]] "r+"]
    fconfigure $f -buffering line
    puts $f [list set script $script]
    puts $f {fconfigure stdout -buffering line}
    puts $f {puts [list [catch $script msg] $msg]; exit}
    fileevent $f readable [list background::handle $f $script $callback]
}

proc background::handle {f script callback} {
    foreach {code msg} [read $f] break
    catch {close $f}
    uplevel "#0" $callback [list $script $code $msg]
}

proc accumulate {script code msg} {
    puts "#### COMMANDS\n$script"
    puts "#### CODE\n$code"
    puts "#### RESULT\n$msg"
    # Some simple code to collect the results
    if {[llength [lappend ::accumulator $msg]] == 3} {
        set ::done yes
    }
}

foreach task {
    {after 1000; subst hi1}
    {after 2000; subst hi2}
    {after 3000; subst hi3}
} {
    background::task $task accumulate
}
puts "WAITING FOR TASKS..."
vwait done
Notes: the tasks are Tcl commands that produce a result, but they must not print the result out; the fabric code (in background::task) handles that. These are subprocesses; they share nothing with one another, so anything you want them to do or be configured with must be sent as part of the task. A more sophisticated version could keep a hot pool of subprocesses around and in general work very much like a thread pool (subject to the subtle differences due to being in a subprocess and not a thread) but that was more code than I wanted to write here.
Result codes (i.e., exception codes) are 0 for “ok”, 1 for “error”, and other values in less common cases. They're exactly the values documented on the Tcl 8.6 catch manual page; it's up to you to interpret them correctly. (I suppose I should also add code to make the ::errorInfo and ::errorCode variable contents be reported back in the case of an error, but that makes the code rather more complex…)

Same command, with different parameters, on a while true loop with bash or something else

Bash always drives me crazy; I don't understand it.
I basically want to do this (I'm not using any specific syntax; it's just to explain my problem):
processes_count = 20;
for (i = 0; i < processes_count; i++)
{
php -f file.php "{$i}-{$processes_count}" &
proc_id[i] = $!
}
The above loop starts the processes. The next one should keep the processes alive forever:
while(true)
{
foreach(proc_id as id)
{
if(!exist(proc_id[id]))
{
php -f file.php "{$id}-{$processes_count}" &
proc_id[id] = $!
}
}
sleep 5
}
If someone can help translate this into bash, Python, or something else, thank you :)
I don't think you can do that because bash doesn't provide a method to 'wait for any one child process to die and let me know which one it was that died'. The nearest approach is wait:
wait
wait [jobspec or pid ...]
Wait until the child process specified by each process id pid or job specification
jobspec exits and return the exit status of the last command waited for. If a
job spec is given, all processes in the job are waited for. If no arguments are
given, all currently active child processes are waited for, and the return status
is zero. If neither jobspec nor pid specifies an active child process of the shell,
the return status is 127.
This means you can wait for a specific child to die, or you can wait for all children to die, but you can't do what you want.
If you drop into Perl or Python, you can do it using the wait system call; a short Perl sketch follows.
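A minimal sketch of that approach in Perl, mirroring the pseudocode above (the php command line is the questioner's; I haven't run it):
use strict;
use warnings;

my $processes_count = 20;
my %slot_of;                           # pid => slot index

sub spawn {
    my ($i) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {                   # child: replace ourselves with the worker
        exec 'php', '-f', 'file.php', "$i-$processes_count"
            or die "exec failed: $!";
    }
    $slot_of{$pid} = $i;
}

spawn($_) for 0 .. $processes_count - 1;

while (1) {
    my $pid = wait();                  # block until any child exits
    last if $pid == -1;                # no children left
    my $i = delete $slot_of{$pid};
    spawn($i);                         # restart the slot that died
}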

How can I get forking pipes to work in Perl on Windows?

I'm trying to port a Perl script over from Unix to Windows but am having a near impossible time getting it to work due to the unsupported forking pipes in the open function. Here's the code:
sub p4_get_file_content {
    my $filespec = shift;
    return 'Content placeholder!' if ($options{'dry-run'});
    debug("p4_get_file_content: $filespec\n");
    local *P4_OUTPUT;
    local $/ = undef;
    my $pid = open(P4_OUTPUT, "-|");
    die "Fork failed: $!" unless defined $pid;
    if ($pid == 0) { # child
        my $p4 = p4_init();
        my $result = undef;
        $result = $p4->Run('print', $filespec);
        die $p4->Errors() if $p4->ErrorCount();
        if (ref $result eq 'ARRAY') {
            for (my $i = 1; $i < @$result; $i++) {
                print $result->[$i];
            }
        }
        $p4->Disconnect();
        exit 0;
    }
    my $content = <P4_OUTPUT>;
    close(P4_OUTPUT) or die "Close failed: ($?) $!";
    return $content;
}
The error is:
'-' is not recognized as an internal or external command,
operable program or batch file.
Does anyone know how to make this work? Thanks!
Mike
I know it's not a direct answer to your question, but it looks like you're scripting something on top of Perforce in Perl. If so, you might find that an existing library already does what you want and saves you a lot of headaches, or at least gives you some sample code to work from.
For example:
P4Perl
P4::Server
P4::C4
EDIT: Now that I know what you're doing, I'm guessing you're trying to port p42svn to Windows, or rather make it compatible with Windows at least. See this thread for a discussion of this exact issue. The recommendation (untested) is to try the code samples listed at http://perldoc.perl.org/perlfork.html under "Forking pipe open() not yet implemented" to explicitly create the pipe instead; a rough sketch of that pattern follows.
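A rough sketch of the perlfork recipe, adapted to the shape of p4_get_file_content above (only an outline, not a drop-in replacement; the P4 calls would go where the placeholder print is):
# create the pipe explicitly instead of relying on open(..., "-|")
pipe(my $reader, my $writer) or die "pipe failed: $!";
my $pid = fork();
die "Fork failed: $!" unless defined $pid;
if ($pid == 0) {                       # child
    close $reader;
    # ... run the P4 commands here, printing to $writer instead of STDOUT ...
    print {$writer} "placeholder output\n";
    close $writer;
    exit 0;
}
close $writer;                         # parent must close its copy of the write end
my $content = do { local $/; <$reader> };   # slurp everything the child wrote
close $reader;
waitpid $pid, 0;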
It's not going to work as-is. You'll need to find another method to accomplish what it's doing. It doesn't look like there's a burning need for the fork-pipe, but it's hard to tell since I don't know what a p4 is and a lot of your code is being lost to angle-bracket interpretation.
