Handling zombie processes when using waitpid - ruby

I have the following method. The idea is to run a shell command and both stream its output to stdout as it is received and store it in a variable, so that I can return a hash of the information. I found no standard way of doing this (you get either streaming or captured output).
It does this by forking child processes that stream the output and append it to an IO pipe, which I read from later.
def self.run_cmd(cmd)
  stdout_rd, stdout_wr = IO.pipe
  stderr_rd, stderr_wr = IO.pipe
  status = Open4::popen4(cmd) do |_pid, _stdin, _stdout, _stderr|
    pids = []
    pids << fork do
      _stdout.each_line do |l|
        print l
        stdout_wr.puts l
      end
    end
    pids << fork do
      _stderr.each_line do |l|
        print l
        stderr_wr.puts l
      end
    end
    pids.each { |pid| Process.waitpid(pid) }
  end
  stdout_wr.close
  stderr_wr.close
  out = stdout_rd.gets
  out = '' if out.nil?
  err = stderr_rd.gets
  err = '' if err.nil?
  { stdout: out, stderr: err, status: status.exitstatus }
end
This works great in most scenarios, but unzip specifically doesn't play well with this approach: after a fixed amount of output from unzip, it stalls at stdout_wr.puts l.
I've observed that when the Ruby process has stalled, a zombie unzip is visible in the output of ps.
Is there any way I can make this work?
Is there a better way of doing this? I appreciate that it's a complex solution, and there must be an easier way.
My working theory is that my IO pipe is running out of buffer space, but I'm able to print 10,000 lines of output without issue.
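For comparison, here is a minimal sketch of one way to get both streaming and capture without forking. It assumes Open3 from the standard library rather than open4, and the echo_to / echo_err_to parameters are illustrative names that are not part of the method above. Each stream is drained by its own thread as the data arrives, so no pipe buffer is left to fill up while nothing reads from it, which is one plausible cause of the stall described above.

```ruby
require 'open3'

# Run cmd, echoing each line as it arrives while also capturing it.
# echo_to and echo_err_to are hypothetical parameters added for this sketch.
def run_cmd(cmd, echo_to: $stdout, echo_err_to: $stderr)
  captured_out = String.new
  captured_err = String.new

  status = Open3.popen3(cmd) do |stdin, stdout, stderr, wait_thr|
    stdin.close # the command gets no input in this sketch
    readers = [
      Thread.new { stdout.each_line { |l| echo_to.print(l); captured_out << l } },
      Thread.new { stderr.each_line { |l| echo_err_to.print(l); captured_err << l } }
    ]
    readers.each(&:join) # both streams have reached EOF
    wait_thr.value       # Process::Status of the child; reaps it
  end

  { stdout: captured_out, stderr: captured_err, status: status.exitstatus }
end
```

Calling run_cmd("unzip archive.zip") (a hypothetical archive name) would print the output as it is produced and return it in the hash afterwards.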

Related

Ruby: intercept popen system call and log stdout and stderr to same file

In ruby code I am running a system call with Open3.popen3 and using the resultant IO for stdout and stderr to do some log message formatting before writing to one log file. I was wondering what would be the best way to do this so log messages will maintain the correct order, note I need to do separate formatting for error messages as for stdout messages.
Here's my current code (Assume logger is thread safe)
Open3.popen3("my_custom_script with_some_args") do |_in, stdout, stderr|
  stdout_thr = Thread.new do
    # gets returns nil at EOF, so check for nil before calling chomp
    while (line = stdout.gets)
      logger.info(format(:info, line.chomp))
    end
  end
  stderr_thr = Thread.new do
    while (line = stderr.gets)
      logger.error(format(:error, line.chomp))
    end
  end
  [stdout_thr, stderr_thr].each(&:join)
end
This has worked for me so far, but I'm not so confident that I can guarantee the correct order of the log messages. Is there a better way?
What you're trying to achieve is not possible with a guarantee. First thing to note is that your code could only possibly order based on the time that the data was received, not when it was produced, which is not quite the same. The only way to guarantee this would be to do something on the source which will add some guaranteed ordering between the two systems.
The code below should make it "more likely" to be correct by removing the threads. On MRI, the global interpreter lock means only one thread runs Ruby code at a time (and in Ruby 1.8 the threads are "green"), so the two reader threads can't truly run simultaneously. That leaves you beholden to the scheduler choosing to run your thread at the "right" time.
Open3.popen3("my_custom_script with_some_args") do |_in, stdout, stderr|
  for_reading = [stdout, stderr]
  until for_reading.empty?
    wait_timeout = 1
    # IO.select blocks until one of the streams has something to read
    # or the wait timeout is reached
    readable, _writable, _errors = IO.select(for_reading, [], [], wait_timeout)
    # readable is nil in the case of a timeout - loop back again
    if readable.nil?
      Thread.pass
    else
      # In the case that both streams are readable (and thus have content)
      # read from each of them. In this case, we cannot guarantee any order
      # because we receive the items at essentially the same time.
      # We can still ensure that we don't mix data incorrectly.
      readable.each do |stream|
        buffer = ''
        # loop through reading data until there is an EOF (value is nil)
        # or there is no more data to read right now
        loop do
          # read_nonblock replaces the contents of an output buffer on every
          # call, so append each chunk to buffer instead of passing buffer in
          tmp = stream.read_nonblock(4096, exception: false)
          if tmp.nil?
            # stream is EOF - nothing more to read on that one..
            for_reading -= [stream]
            break
          elsif tmp == :wait_readable || tmp.empty?
            # nothing more to read right now...
            # continue on to process the buffer into lines and log them
            break
          end
          buffer << tmp
        end
        if stream == stdout
          buffer.split("\n").each { |line| logger.info(format(:info, line)) }
        elsif stream == stderr
          buffer.split("\n").each { |line| logger.error(format(:error, line)) }
        end
      end
    end
  end
end
Note that in a system generating a lot of output in a very short period of time, there is more likely to be an overlap where things get out of order. This likelihood increases with the amount of time taken to read the stream and process it. It would be best to ensure that the absolute minimum processing is done inside the loop. If the formatting (and writing) are expensive, consider moving those items into a separate thread reading from a single queue, and have the code inside the loop only push the buffer (and source identifier) onto the queue.
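The queue idea in the last paragraph could look roughly like this. It is only a sketch: drain_with_queue, the :done sentinel and the [source, line] result format are names invented for the illustration, and the consumer simply collects lines where real code would format and log them. It also does not reassemble a line that happens to be split across two chunks.

```ruby
require 'open3'

# Hypothetical helper: the select loop only pushes raw chunks onto a queue;
# a single consumer thread does the (possibly slow) per-line processing.
def drain_with_queue(cmd)
  queue = Queue.new
  results = []

  consumer = Thread.new do
    # :done is a sentinel chosen for this sketch
    while (item = queue.pop) != :done
      source, chunk = item
      # Stand-in for formatting and logging
      chunk.each_line { |line| results << [source, line.chomp] }
    end
  end

  Open3.popen3(cmd) do |stdin, stdout, stderr, _wait_thr|
    stdin.close
    sources = { stdout => :info, stderr => :error }
    pending = [stdout, stderr]
    until pending.empty?
      readable, = IO.select(pending, nil, nil, 1)
      next if readable.nil? # timeout, try again
      readable.each do |stream|
        chunk = stream.read_nonblock(4096, exception: false)
        if chunk.nil?
          pending.delete(stream) # EOF on this stream
        elsif chunk != :wait_readable
          queue << [sources[stream], chunk]
        end
      end
    end
  end

  queue << :done
  consumer.join
  results
end
```

Because there is a single queue and a single consumer, entries are processed in the order the reading loop received them, which is the strongest ordering this approach can offer.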

IO.copy_stream performance in ruby

I am trying to continuously read a file in Ruby (it grows over time and needs to be processed in a separate process). Currently I am achieving this with the following bit of code:
r, w = IO.pipe
pid = Process.spawn('ffmpeg' + ffmpeg_args, { STDIN => r, STDERR => STDOUT })
Process.detach pid
while true do
  IO.copy_stream(open(@options['filename']), w)
  sleep 1
end
However - while working - I can't imagine that this is the most performant way of doing it. An alternative would be to use the following variation:
step = 1024 * 4
copied = 0
pid = Process.spawn('ffmpeg' + ffmpeg_args, { STDIN => r, STDERR => STDOUT })
Process.detach pid
while true do
  IO.copy_stream(open(@options['filename']), w, step, copied)
  copied += step
  sleep 1
end
which would only continuously copy parts of the file (the issue being what happens if the step ever overreaches the end of the file). Other approaches, such as a simple file read, led to ffmpeg failing when there was no new data. With this solution, frames are dropped if no new data is available (which is what I need).
Is there a better (more performant) way to achieve something like that?
EDIT:
Using the method proposed by @RaVeN I am now using the following code:
open(@options['filename'], 'rb') do |stream|
  stream.seek(0, IO::SEEK_END)
  queue = INotify::Notifier.new
  queue.watch(@options['filename'], :modify) do
    w.write stream.read
  end
  queue.run
end
However, ffmpeg now complains about invalid data. Is there a method other than read that I should use?
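On the concern about the step overreaching the end of the file in the stepped IO.copy_stream variation: IO.copy_stream returns the number of bytes it actually copied, so the offset can be advanced by that value instead of by a fixed step. The following is a minimal sketch under that assumption; copy_new_bytes is a hypothetical helper name, and it does not address the invalid-data complaint from ffmpeg.

```ruby
# Copy whatever has been appended to the file at path since offset into dest,
# and return the new offset. copy_new_bytes is a name made up for this sketch.
def copy_new_bytes(path, dest, offset)
  File.open(path, 'rb') do |src|
    # With a nil length, copy_stream copies up to the current end of file.
    # Its return value is the number of bytes actually copied, so the offset
    # never runs past the data that exists.
    offset + IO.copy_stream(src, dest, nil, offset)
  end
end
```

In the loop from the question this would be used as copied = copy_new_bytes(@options['filename'], w, copied) followed by sleep 1.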

How to interactively run mount command from a (Ruby) script?

I am trying to write a Ruby script that runs the mount command interactively behind the scenes. The problem is, if I redirect input and output of the mount command to pipes, it doesn't work. Somehow, mount seems to realise that it's not talking directly to stdin/stdout and falls over. Either that, or it's a more wide-ranging problem that would affect all interactive commands; I don't know.
I want to be able to parse the output of mount, line by line, and shove answers into its input pipe when it asks questions. This shouldn't be an unreasonable expectation. Can someone help, please?
Examples:
def read_until(pipe, stop_at, timeoutsec = 10, verbose = false)
lines = []; line = ""
while result = IO.select([pipe], nil, nil, timeoutsec)
next if result.empty?
begin
c = pipe.read(1) rescue c = nil
end
break if c.nil?
line << c
break if line =~ stop_at
# Start a new line?
if line[-1] == ?\n
puts line if verbose
lines << line.strip
line = ""
end
end
return lines, line.match(stop_at)
end
cmd = "mount.ecryptfs -f /tmp/1 /tmp/2"
status = Open3::popen2e(cmd) { |i,o,t|
o.fcntl(3, 4) # Set non-blocking (this doesn't make any difference)
i.fcntl(3, 4) # Set non-blocking (this doesn't make any difference)
puts read_until(o, /some pattern/, 1, true) # Outputs [[], nil]
}
I've also tried spawn:
a, b = IO.pipe
c, d = IO.pipe
pid = spawn(cmd, :in=>a, :out=>d)
puts read_until(c, /some pattern/, 1, true) # Outputs [[], nil]
I've tried subprocess, pty and a host of other solutions - basically, if it's on Google, I've tried it. It seems that mount just knows if I'm not passing it a real shell, and deliberately blocks. See:
pid = spawn(cmd, :in=>STDIN, :out=>STDOUT) # Works
pid = spawn(cmd, :in=>somepipe, :out=>STDOUT) # Blocks after first line of output, for no reason whatsoever. It's not expecting any input at this point.
I even tried spawning a real shell (e.g. bash) and sending the mount command to it via an input pipe. Same problem.
Please ignore any obvious errors in the above: I have tried several solutions tonight, so the actual code has been rewritten many times. I wrote the above from memory.
What I want is the following:
Run mount command with arguments, getting pipes for its input and output streams
Wait for first specific question on output pipe
Answer specific question by writing to input pipe
Wait for second specific question on output pipe
...etc...
And so on.
You may find Kernel#system useful. It runs the command in a subshell, so if you are OK with the user just interacting with mount directly, this will make everything much easier.

Redirecting stdout/stderr of spawn() to a string in Ruby

I would like to execute an external process in Ruby using spawn (for multiple concurrent child processes) and collect the stdout or stderr into a string, in a similar way to what can be done with Python's subprocess Popen.communicate().
I tried redirecting :out/:err to a new StringIO object, but that generates an ArgumentError, and temporarily redefining $stdxxx would mix up the outputs of the child processes.
In case you don't like popen, here's my way:
r, w = IO.pipe
pid = Process.spawn(command, :out => w, :err => [:child, :out])
w.close
...
# Read before waiting: a child that fills the pipe buffer blocks until it is read
output = r.read
r.close
pid, status = Process.wait2(pid)
Anyway you can't redirect to a String object directly. You can at most direct it to an IO object and then read from that, just like the code above.
Why do you need spawn? Unless you are on Windows you can use popen*, e.g. popen4:
require "open4"
pid, p_i, p_o, p_e = Open4.popen4("ls")
p_i.close
o, e = [p_o, p_e].map { |p| begin p.read ensure p.close end }
s = Process::waitpid2(pid).last
From the Ruby docs it seems that you can't, but you can do this:
pid = spawn("ls", :out => ["/tmp/ruby_stdout_temp", "w"])
Process.wait(pid) # let the child finish writing before reading the file
stdoutStr = File.read("/tmp/ruby_stdout_temp")
You can also do the same with standard error. Or, if you want to do that and don't mind popen:
io=IO.popen("ls")
stdout=io.read
The simplest and most straightforward way seems to be:
require 'open3'
out, err, ps = Open3.capture3("ls")
puts "Process failed with status #{ps.exitstatus}" unless ps.success?
Here we have the outputs as strings.
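Since the question asks about several concurrent children, here is a rough sketch that combines Process.spawn with one pipe per child. spawn_capturing and the shape of the returned hashes are made up for the illustration. Each pipe is drained in its own thread so that no child can block on a full pipe while another is being read.

```ruby
# Start every command at once, capture each child's combined stdout/stderr
# as a String, and collect its exit status. Hypothetical helper for this sketch.
def spawn_capturing(commands)
  children = commands.map do |command|
    r, w = IO.pipe
    pid = Process.spawn(command, out: w, err: [:child, :out])
    w.close # the parent keeps only the read end
    reader = Thread.new do
      begin
        r.read
      ensure
        r.close
      end
    end
    { pid: pid, reader: reader }
  end

  children.map do |child|
    output = child[:reader].value               # waits for EOF on that pipe
    _pid, status = Process.wait2(child[:pid])   # reap the child
    { output: output, status: status }
  end
end
```

For example, spawn_capturing(["ls", "date"]) returns two hashes in the same order as the commands.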

How do I block on reading a named pipe in Ruby?

I'm trying to set up a Ruby script that reads from a named pipe in a loop, blocking until input is available in the pipe.
I have a process that periodically puts debugging events into a named pipe:
# Open the logging pipe
log = File.open("log_pipe", "w+") #'log_pipe' created in shell using mkfifo
...
# An interesting event happens
log.puts "Interesting event #4291 occurred"
log.flush
...
I then want a separate process that will read from this pipe and print events to the console as they happen. I've tried using code like this:
input = File.open("log_pipe", "r+")
while true
puts input.gets #I expect this to block and wait for input
end
# Kill loop with ctrl+c when done
I want the input.gets to block, waiting patiently until new input arrives in the fifo; but instead it immediately reads nil and loops again, scrolling off the top of the console window.
Two things I've tried:
I've opened the input fifo with both "r" and "r+"--I have the same problem either way;
I've tried to determine if my writing process is sending EOF (which I've heard will cause the read fifo to close)--AFAIK it isn't.
SOME CONTEXT:
If it helps, here's a 'big picture' view of what I'm trying to do:
I'm working on a game that runs in RGSS, a Ruby based game engine. Since it doesn't have good integrated debugging, I want to set up a real-time log as the game runs--as events happen in the game, I want messages to show up in a console window on the side. I can send events in the Ruby game code to a named pipe using code similar to the writer code above; I'm now trying to set up a separate process that will wait for events to show up in the pipe and show them on the console as they arrive. I'm not even sure I need Ruby to do this, but it was the first solution I could think of.
Note that I'm using mkfifo from cygwin, which I happened to have installed anyway; I wonder if that might be the source of my trouble.
If it helps anyone, here's exactly what I see in irb with my 'reader' process:
irb(main):001:0> input = File.open("mypipe", "r")
=> #<File:mypipe>
irb(main):002:0> x = input.gets
=> nil
irb(main):003:0> x = input.gets
=> nil
I don't expect the input.gets at 002 and 003 to return immediately--I expect them to block.
I found a solution that avoids using Cygwin's unreliable named pipe implementation entirely. Windows has its own named pipe facility, and there is even a Ruby Gem called win32-pipe that uses it.
Unfortunately, there appears to be no way to use Ruby Gems in an RGSS script; but by dissecting the win32-pipe gem, I was able to incorporate the same idea into an RGSS game. This code is the bare minimum needed to log game events in real time to a back channel, but it can be very useful for deep debugging.
I added a new script page right before 'Main' and added this:
module PipeLogger
  # -- Change THIS to change the name of the pipe!
  PIPE_NAME = "RGSSPipe"
  # Constant Defines
  PIPE_DEFAULT_MODE = 0 # Pipe operation mode
  PIPE_ACCESS_DUPLEX = 0x00000003 # Pipe open mode
  PIPE_UNLIMITED_INSTANCES = 255 # Number of concurrent instances
  PIPE_BUFFER_SIZE = 1024 # Size of I/O buffer (1K)
  PIPE_TIMEOUT = 5000 # Wait time for buffer (5 secs)
  INVALID_HANDLE_VALUE = 0xFFFFFFFF # Retval for bad pipe handle
  #-----------------------------------------------------------------------
  # make_APIs
  #-----------------------------------------------------------------------
  def self.make_APIs
    $CreateNamedPipe = Win32API.new('kernel32', 'CreateNamedPipe', 'PLLLLLLL', 'L')
    $FlushFileBuffers = Win32API.new('kernel32', 'FlushFileBuffers', 'L', 'B')
    $DisconnectNamedPipe = Win32API.new('kernel32', 'DisconnectNamedPipe', 'L', 'B')
    $WriteFile = Win32API.new('kernel32', 'WriteFile', 'LPLPP', 'B')
    $CloseHandle = Win32API.new('kernel32', 'CloseHandle', 'L', 'B')
  end
  #-----------------------------------------------------------------------
  # setup_pipe
  #-----------------------------------------------------------------------
  def self.setup_pipe
    make_APIs
    @@name = "\\\\.\\pipe\\" + PIPE_NAME
    @@pipe_mode = PIPE_DEFAULT_MODE
    @@open_mode = PIPE_ACCESS_DUPLEX
    @@pipe = nil
    @@buffer = 0.chr * PIPE_BUFFER_SIZE
    @@size = 0
    @@bytes = [0].pack('L')
    @@pipe = $CreateNamedPipe.call(
      @@name,
      @@open_mode,
      @@pipe_mode,
      PIPE_UNLIMITED_INSTANCES,
      PIPE_BUFFER_SIZE,
      PIPE_BUFFER_SIZE,
      PIPE_TIMEOUT,
      0
    )
    if @@pipe == INVALID_HANDLE_VALUE
      # If we could not open the pipe, notify the user
      # and proceed quietly
      print "WARNING -- Unable to create named pipe: " + PIPE_NAME
      @@pipe = nil
    else
      # Prompt the user to open the pipe
      print "Please launch the RGSSMonitor.rb script"
    end
  end
  #-----------------------------------------------------------------------
  # write_to_pipe ('msg' must be a string)
  #-----------------------------------------------------------------------
  def self.write_to_pipe(msg)
    if @@pipe
      # Format data
      @@buffer = msg
      @@size = msg.size
      $WriteFile.call(@@pipe, @@buffer, @@buffer.size, @@bytes, 0)
    end
  end
  #------------------------------------------------------------------------
  # close_pipe
  #------------------------------------------------------------------------
  def self.close_pipe
    if @@pipe
      # Send kill message to RGSSMonitor
      @@buffer = "!!GAMEOVER!!"
      @@size = @@buffer.size
      $WriteFile.call(@@pipe, @@buffer, @@buffer.size, @@bytes, 0)
      # Close down the pipe
      $FlushFileBuffers.call(@@pipe)
      $DisconnectNamedPipe.call(@@pipe)
      $CloseHandle.call(@@pipe)
      @@pipe = nil
    end
  end
end
To use this, you only need to make sure to call PipeLogger::setup_pipe before writing an event; and call PipeLogger::close_pipe before game exit. (I put the setup call at the start of 'Main', and add an ensure clause to call close_pipe.) After that, you can add a call to PipeLogger::write_to_pipe("msg") at any point in any script with any string for "msg" and write into the pipe.
I have tested this code with RPG Maker XP; it should also work with RPG Maker VX and later.
You will also need something to read FROM the pipe. There are any number of ways to do this, but a simple one is to use a standard Ruby installation, the win32-pipe Ruby Gem, and this script:
require 'rubygems'
require 'win32/pipe'
include Win32
# -- Change THIS to change the name of the pipe!
PIPE_NAME = "RGSSPipe"
Thread.new { loop { sleep 0.01 } } # Allow Ctrl+C
pipe = Pipe::Client.new(PIPE_NAME)
continue = true
while continue
msg = pipe.read.to_s
puts msg
continue = false if msg.chomp == "!!GAMEOVER!!"
end
I use Ruby 1.8.7 for Windows and the win32-pipe gem mentioned above. Save the above as "RGSSMonitor.rb" and invoke it from the command line as ruby RGSSMonitor.rb.
CAVEATS:
The RGSS code listed above is fragile; in particular, it does not handle failure to open the named pipe. This is not usually an issue on your own development machine, but I would not recommend shipping this code.
I haven't tested it, but I suspect you'll have problems if you write a lot of things to the log without running a process to read the pipe (e.g. RGSSMonitor.rb). A Windows named pipe has a fixed size (I set it here to 1K), and by default writes will block once the pipe is filled (because no process is 'relieving the pressure' by reading from it). Unfortunately, the RPGXP engine will kill a Ruby script that has stopped running for 10 seconds. (I'm told that RPGVX has eliminated this watchdog function--in which case, the game will hang instead of abruptly terminating.)
What's probably happening is the writing process is exiting, and as there are no other writing processes, EOF is sent to the pipe which causes gets to return nil, and so your code loops continually.
To get around this you can usually just open the pipe read-write at the reader end. This works for me (on a Mac), but isn't working for you (you've tried "r" and "r+"). I'm guessing this has to do with Cygwin (POSIX says opening a FIFO read-write is undefined).
An alternative is to open the pipe twice, once read-only and once write-only. You don't use the write-only IO for anything, it's just so that there's always an active writer attached to the pipe so it doesn't get closed.
input = File.open("log_pipe", "r") # note 'r', not 'r+'
keep_open = File.open("log_pipe", "w") # ensure there's always a writer
while true
puts input.gets
end
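The EOF behaviour described in this answer can be seen without a FIFO at all. The sketch below uses an anonymous IO.pipe as a stand-in (an assumption made so it runs anywhere; Cygwin's named pipes may behave differently), and read_until_eof is a hypothetical helper. Once the last write end is closed, gets returns nil immediately instead of blocking; as long as some write end stays open, as with keep_open above, gets blocks instead.

```ruby
# Collect lines until gets reports EOF (nil), which happens only when
# every write end of the pipe has been closed.
def read_until_eof(reader)
  lines = []
  while (line = reader.gets)
    lines << line.chomp
  end
  lines
end

r, w = IO.pipe
w.puts "Interesting event #4291 occurred"
w.close                      # last writer gone: the reader now sees EOF
events = read_until_eof(r)   # the one event written above
after_eof = r.gets           # nil again, returned without blocking
```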
