Why is IO::WaitReadable being raised differently for STDOUT than STDERR? - ruby

Given that I wish to test non-blocking reads from a long command, I created the following script, saved it as long, made it executable with chmod 755, and placed it in my path (saved as ~/bin/long where ~/bin is in my path).
I am on a *nix variant with ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-darwin11.0.0] compiled with RVM defaults. I do not use Windows, and am therefore unsure if the test script will work for you if you do.
#!/usr/bin/env ruby
3.times do
  STDOUT.puts 'message on stdout'
  STDERR.puts 'message on stderr'
  sleep 1
end
Why does long_err produce each STDERR message as it is printed by "long"
def long_err(bash_cmd = 'long', maxlen = 4096)
  stdin, stdout, stderr = Open3.popen3(bash_cmd)
  begin
    begin
      puts 'err -> ' + stderr.read_nonblock(maxlen)
    end while true
  rescue IO::WaitReadable
    IO.select([stderr])
    retry
  rescue EOFError
    puts 'EOF'
  end
end
while long_out remains blocked until all STDOUT messages are printed?
def long_out(bash_cmd = 'long', maxlen = 4096)
  stdin, stdout, stderr = Open3.popen3(bash_cmd)
  begin
    begin
      puts 'out -> ' + stdout.read_nonblock(maxlen)
    end while true
  rescue IO::WaitReadable
    IO.select([stdout])
    retry
  rescue EOFError
    puts 'EOF'
  end
end
I assume you will require 'open3' before testing either function.
Why is IO::WaitReadable being raised differently for STDOUT than STDERR?
Workarounds using other ways to start subprocesses also appreciated if you have them.

On most operating systems, STDOUT is buffered when it is not attached to a terminal (as is the case with a pipe), while STDERR is not buffered. What popen3 does is basically open a pipe between the executable you launch and Ruby.
Any output that is in buffered mode is not sent through this pipe until either:
The buffer is filled (thereby forcing a flush).
The sending application exits (EOF is reached, forcing a flush).
The stream is explicitly flushed.
The reason STDERR is not buffered is that it's usually considered important for error messages to appear instantly, rather than trading that immediacy for the efficiency of buffering.
So, knowing this, you can emulate STDERR behaviour with STDOUT like this:
#!/usr/bin/env ruby
3.times do
  STDOUT.puts 'message on stdout'
  STDOUT.flush
  STDERR.puts 'message on stderr'
  sleep 1
end
and you will see the difference.
You might also want to check "Understanding Ruby and OS I/O buffering".
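An alternative to calling flush after every write is to switch the stream to synchronous mode once, with STDOUT.sync = true. The following is a minimal sketch rather than part of the original scripts: the child program is passed inline via ruby -e as a stand-in for "long", and read_chunks is a hypothetical helper name.

```ruby
require 'open3'
require 'rbconfig'

# Stand-in for the "long" script (an assumption, passed inline via `ruby -e`).
# With sync enabled, every write to STDOUT reaches the pipe immediately,
# which is how STDERR already behaves.
CHILD = <<~'RUBY'
  STDOUT.sync = true
  3.times do |i|
    STDOUT.puts "message #{i} on stdout"
    sleep 0.2
  end
RUBY

# Hypothetical helper: collect each chunk as soon as it becomes readable.
def read_chunks(*cmd, maxlen: 4096)
  chunks = []
  Open3.popen3(*cmd) do |stdin, stdout, _stderr, wait_thr|
    stdin.close
    loop do
      begin
        chunks << stdout.read_nonblock(maxlen)
      rescue IO::WaitReadable
        IO.select([stdout]) # block until the child writes or exits
      rescue EOFError
        break
      end
    end
    wait_thr.value # reap the child
  end
  chunks
end

chunks = read_chunks(RbConfig.ruby, '-e', CHILD)
chunks.each { |chunk| puts "out -> #{chunk}" }
```

With sync enabled, the messages typically arrive as separate chunks while the child is still running; removing the STDOUT.sync line from the child should make them arrive together when it exits.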

Here's the best I've got so far for starting subprocesses. I launch a lot of network commands so I needed a way to time them out if they take too long to come back. This should be handy in any situation where you want to remain in control of your execution path.
I adapted this from a Gist, adding code to test the exit status of the command for 3 outcomes:
Successful completion (exit status 0)
Error completion (exit status is non-zero) - raises an exception
Command timed out and was killed - raises an exception
Also fixed a race condition, simplified parameters, added a few more comments, and added debug code to help me understand what was happening with exits and signals.
Call the function like this:
output = run_with_timeout("command that might time out", 15)
output will contain the combined STDOUT and STDERR of the command if it completes successfully. If the command doesn't complete within 15 seconds it will be killed and an exception raised.
Here's the function (2 constants you'll need defined at the top):
require 'open3'
DEBUG = false # change to true for some debugging info
BUFFER_SIZE = 4096 # in bytes, this should be fine for many applications
def run_with_timeout(command, timeout)
  output = ''
  tick = 1
  begin
    # Start task in another thread, which spawns a process
    stdin, stderrout, thread = Open3.popen2e(command)
    # Get the pid of the spawned process
    pid = thread.pid
    start = Time.now
    while (Time.now - start) < timeout and thread.alive?
      # Wait up to `tick' seconds for output/error data
      Kernel.select([stderrout], nil, nil, tick)
      # Try to read the data
      begin
        output << stderrout.read_nonblock(BUFFER_SIZE)
        puts "we read some data..." if DEBUG
      rescue IO::WaitReadable
        # No data was ready to be read during the `tick' which is fine
        print "." # give feedback each tick that we're waiting
      rescue EOFError
        # Command has completed, not really an error...
        puts "got EOF." if DEBUG
        # Wait briefly for the thread to exit...
        # We don't want to kill the process if it's about to exit on its
        # own. We decide success or failure based on whether the process
        # completes successfully.
        sleep 1
        break
      end
    end
    if thread.alive?
      # The timeout has been reached and the process is still running so
      # we need to kill the process, because killing the thread leaves
      # the process alive but detached.
      Process.kill("TERM", pid)
    end
  ensure
    stdin.close if stdin
    stderrout.close if stderrout
  end
  status = thread.value # returns Process::Status when process ends
  if DEBUG
    puts "thread.alive?: #{thread.alive?}"
    puts "status: #{status}"
    puts "status.class: #{status.class}"
    puts "status.exited?: #{status.exited?}"
    puts "status.exitstatus: #{status.exitstatus}"
    puts "status.signaled?: #{status.signaled?}"
    puts "status.termsig: #{status.termsig}"
    puts "status.stopsig: #{status.stopsig}"
    puts "status.stopped?: #{status.stopped?}"
    puts "status.success?: #{status.success?}"
  end
  # See how process ended: .success? => true, false or nil if exited? !true
  if status.success? == true # process exited (0)
    return output
  elsif status.success? == false # process exited (non-zero)
    raise "command `#{command}' returned non-zero exit status (#{status.exitstatus}), see below output\n#{output}"
  elsif status.signaled? # we killed the process (timeout reached)
    raise "shell command `#{command}' timed out and was killed (timeout = #{timeout}s): #{status}"
  else
    raise "process didn't exit and wasn't signaled. We shouldn't get to here."
  end
end
Hope this is useful.
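To see how the three outcomes map onto Process::Status in isolation, here is a small sketch. It is an illustration under assumptions, not part of the function above: describe_status and run_and_describe are hypothetical helpers, and the commands (inline ruby -e scripts and the *nix sleep utility) are chosen only to trigger each case.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: summarize how a child process ended.
def describe_status(status)
  if status.success?      # exited with status 0
    :ok
  elsif status.exited?    # exited on its own with a non-zero status
    :"exit_#{status.exitstatus}"
  elsif status.signaled?  # terminated by a signal, e.g. the TERM we sent
    :"signal_#{status.termsig}"
  else
    :unknown
  end
end

# Hypothetical helper: run a command, optionally send it TERM right away,
# drain its combined output and report how it ended.
def run_and_describe(*cmd, kill: false)
  Open3.popen2e(*cmd) do |stdin, out, wait_thr|
    stdin.close
    Process.kill('TERM', wait_thr.pid) if kill
    out.read # read until EOF so the child is never blocked on a full pipe
    describe_status(wait_thr.value)
  end
end

p run_and_describe(RbConfig.ruby, '-e', 'exit 0')
p run_and_describe(RbConfig.ruby, '-e', 'exit 3')
p run_and_describe('sleep', '30', kill: true)
```

Note that success? returns nil (not false) for a process that was killed by a signal, which is why the signaled? branch is reachable.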

Related

Kill a process called using open3 in ruby

I'm using a command line program, it works as mentioned below:
$ ROUTE_TO_FOLDER/app < "long text"
If "long text" is written using the parameters "app" needs, then it will fill a text file with results. If not, it will fill the text file with dots continuously (I can't handle or modify the code of "app" in order to avoid this).
In a ruby script there's a line like this:
text = "long text that will be used by app"
output = system("ROUTE_TO_FOLDER/app < #{text}")
Now, if text is well written, there won't be problems and I will get an output file as mentioned before. The problem comes when text is not well written. What happens next is that my ruby script hangs and I'm not sure how to kill it.
I've found Open3 and I've used the method like this:
irb> cmd = "ROUTE_TO_FOLDER/app < #{text}"
irb> stdin, stdout, stderr, wait_thr = Open3.popen3(cmd)
=> [#<IO:fd 10>, #<IO:fd 11>, #<IO:fd 13>, #<Thread:0x007f3a1a6f8820 run>]
When I do:
irb> wait_thr.value
it also hangs, and :
irb> wait_thr.status
=> "sleep"
How can I avoid these problems? Is it not recognizing that "app" has failed?
wait_thr.pid provides you the pid of the started process. Just do
Process.kill("KILL",wait_thr.pid)
when you need to kill it.
You can combine it with detecting if the process is hung (continuously outputs dots) in one of the two ways.
1) Set a timeout for waiting for the process:
get '/process' do
  text = "long text that will be used by app"
  cmd = "ROUTE_TO_FOLDER/app < #{text}"
  Open3.popen3(cmd) do |i,o,e,w|
    begin
      Timeout.timeout(10) do # timeout set to 10 sec, change if needed
        # process output of the process. it will produce EOF when done.
        until o.eof? do
          # o.read_nonblock(N) ...
        end
      end
    rescue Timeout::Error
      # here you know that the process took longer than 10 seconds
      Process.kill("KILL", w.pid)
      # do whatever other error processing you need
    end
  end
end
2) Check the process output. (The code below is simplified - you probably don't want to read the output of your process into a single String buf first and then process, but I guess you get the idea).
get '/process' do
  text = "long text that will be used by app"
  cmd = "ROUTE_TO_FOLDER/app < #{text}"
  Open3.popen3(cmd) do |i,o,e,w|
    # process output of the process. it will produce EOF when done.
    # If you get 16 dots in a row - the process is in the continuous loop
    # (you may want to deal with stderr instead - depending on where these dots are sent to)
    buf = ""
    error = false
    until o.eof? do
      buf << o.read_nonblock(16)
      if buf.size>=16 && buf[-16..-1] == '.'*16
        # ok, the process is hung
        Process.kill("KILL", w.pid)
        error = true
        # you should also get o.eof? the next time you check (or after flushing the pipe buffer),
        # so you will get out of the until o.eof? loop
      end
    end
    if error
      # do whatever error processing you need
    else
      # process buf, it contains all the output
    end
  end
end
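The timeout approach can also be written without the Timeout module, by passing the remaining time to IO.select. The following is a rough sketch under assumptions: read_with_deadline is a hypothetical helper, popen2e is used so that STDOUT and STDERR arrive on one stream, and the *nix sleep command stands in for a hung "app".

```ruby
require 'open3'

# Hypothetical helper: read a command's combined output and kill the command
# if it is still running after `limit` seconds. Returns [output, timed_out].
def read_with_deadline(*cmd, limit:)
  Open3.popen2e(*cmd) do |stdin, out, wait_thr|
    stdin.close
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + limit
    output = String.new
    timed_out = false
    loop do
      remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
      if remaining <= 0
        begin
          Process.kill('KILL', wait_thr.pid)
        rescue Errno::ESRCH
          # the process exited on its own just before the deadline
        end
        timed_out = true
        break
      end
      # Wait for data, but never longer than the time that is left
      next unless IO.select([out], nil, nil, remaining)
      begin
        output << out.read_nonblock(4096)
      rescue IO::WaitReadable
        # readiness was reported but no data arrived; wait again
      rescue EOFError
        break # the command closed its output, i.e. it finished
      end
    end
    wait_thr.value # reap the child so it does not linger as a zombie
    [output, timed_out]
  end
end

output, timed_out = read_with_deadline('sleep', '30', limit: 0.5)
puts "timed out: #{timed_out}, output: #{output.inspect}"
```

The dot-detection check from the second variant could be added inside the loop, right after the read, if both conditions are needed.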

Double fork and stdin

I wrote this code to run my process as a daemon. The goal is to keep this process running even if I close its parent. Now I would like to be able to write something to its stdin. What should I do? Here's the code.
def daemonize(cmd, options = {})
  rd, wr = IO.pipe
  p1 = Process.fork {
    Process.setsid
    p2 = Process.fork {
      $0 = cmd #Name of the command
      pidfile = File.new(options[:pid_file], 'w')
      pidfile.chmod( 0644 )
      pidfile.puts "#{Process.pid}"
      pidfile.close
      Dir.chdir(ENV["PWD"] = options[:working_dir].to_s) if options[:working_dir]
      File.umask 0000
      STDIN.reopen '/dev/null'
      STDOUT.reopen '/dev/null', 'a'
      STDERR.reopen STDOUT
      Signal.trap("USR1") do
        Console.show 'I just received a USR1', 'warning'
      end
      ::Kernel.exec(*Shellwords.shellwords(cmd)) #Executing the command in the parent process
      exit
    }
    raise 'Fork failed!' if p2 == -1
    Process.detach(p2) # divorce p2 from parent process (p1)
    rd.close
    wr.write p2
    wr.close
    exit
  }
  raise 'Fork failed!' if p1 == -1
  Process.detach(p1) # divorce p1 from parent process (shell)
  wr.close
  daemon_id = rd.read.to_i
  rd.close
  daemon_id
end
Is there a way to reopen stdin on something like a pipe, instead of /dev/null, that I would be able to write to?
How about a fifo? On Linux, you can use the mkfifo command:
$ mkfifo /tmp/mypipe
Then you can reopen STDIN on that pipe:
STDIN.reopen '/tmp/mypipe'
# Do read-y things
Anything else can write to that pipe:
$ echo "roflcopter" > /tmp/mypipe
allowing that data to be read by the ruby process.
(Update) Caveat with blocking
Since fifos block until there's a read and write (e.g. a read is blocked unless there's a write, and vice-versa), it's best handled with multiple threads. One thread should do the reading, passing the data to a queue, and another should handle that input. Here's an example of that situation:
require 'thread'
input = Queue.new
threads = []
# Read from the fifo and add to an input queue (glorified array)
threads << Thread.new(input) do |ip|
  STDIN.reopen 'mypipe'
  loop do
    if line = STDIN.gets
      puts "Read: #{line}"
      ip.push line
    end
  end
end
# Handle the input passed by the reader thread
threads << Thread.new(input) do |ip|
  loop do
    puts "Output: #{ip.pop}"
  end
end
threads.map(&:join)
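The fifo can also be created from Ruby itself. Below is a minimal sketch, assuming Ruby 2.3 or later (for File.mkfifo) and a *nix system; fifo_round_trip is a hypothetical helper, and the writer thread stands in for the separate process that would normally run echo ... > /tmp/mypipe.

```ruby
require 'tmpdir'

# Hypothetical helper: create a fifo in a temporary directory, write the
# given lines into it from one thread and return what the reader received.
def fifo_round_trip(lines)
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'mypipe')
    File.mkfifo(path) # same effect as running `mkfifo` on that path

    writer = Thread.new do
      # Opening for writing blocks until a reader has opened the fifo
      File.open(path, 'w') { |fifo| lines.each { |line| fifo.puts line } }
    end

    # Opening for reading blocks until a writer has opened the fifo;
    # readlines returns once the writer closes its end (EOF)
    received = File.open(path, 'r') { |fifo| fifo.readlines.map(&:chomp) }
    writer.join
    received
  end
end

p fifo_round_trip(%w[roflcopter another-line])
```

In the daemon case, the reading side would be STDIN.reopen on the fifo path instead of File.open, with the same blocking behaviour.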

How do I see the output of multiple forked processes simultaneously

I am using the "fork" option in ruby as follows:
pid1 = fork do
  pid1_output = `ruby scrape1.rb`
  puts "#{pid1_output}"
  puts ""
  exit
end
pid2 = fork do
  pid2_output = `ruby scrape2.rb`
  puts "#{pid2_output}"
  puts ""
  exit
end
pid3 = fork do
  pid3_output = `ruby scrape3.rb`
  puts "#{pid3_output}"
  puts ""
  exit
end
pid4 = fork do
  pid4_output = `ruby scrape4.rb`
  puts "#{pid4_output}"
  puts ""
  exit
end
Process.waitall
The problem here is that sometimes one of the processes (e.g. ruby scrape1.rb) might fail or end up returning ginormous amounts of text that cannot be captured in a variable. How do I still run 4 processes simultaneously and see all their output in one terminal window in real time? I understand the order of output might be mixed up, but that is alright. I basically want to re-route the STDOUT and STDERR of each forked process to the main program. That way I can see what is being scraped by each of my scrapers and follow their progress and errors as they happen.
fork do
  exec("ruby scrape1.rb")
end
fork do
  exec("ruby scrape2.rb")
end
fork do
  exec("ruby scrape3.rb")
end
fork do
  exec("ruby scrape4.rb")
end
Process.waitall
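With fork plus exec, each child inherits the parent's STDOUT and STDERR, so its output shows up in the terminal as it is produced instead of being collected by the backticks. If it helps to tell the scrapers apart, one option is to read each child's output in a thread and prefix every line. This is only a sketch under assumptions: stream_tagged is a hypothetical helper, and the inline ruby -e commands stand in for the scrapeN.rb scripts.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: run every command concurrently and echo each line of
# its combined STDOUT/STDERR as soon as it arrives, prefixed with the job name.
# Returns one Process::Status per job, in the order the jobs were given.
def stream_tagged(jobs, out: $stdout)
  lock = Mutex.new # keeps lines from different jobs from interleaving mid-line
  threads = jobs.map do |name, cmd|
    Thread.new do
      Open3.popen2e(*cmd) do |stdin, output, wait_thr|
        stdin.close
        output.each_line do |line|
          lock.synchronize { out.puts "[#{name}] #{line.chomp}" }
        end
        wait_thr.value
      end
    end
  end
  threads.map(&:value)
end

# Stand-ins for the scraper scripts (assumptions for the sake of the example)
jobs = {
  'scrape1' => [RbConfig.ruby, '-e', '$stdout.sync = true; 2.times { |i| puts "page #{i}"; sleep 0.1 }'],
  'scrape2' => [RbConfig.ruby, '-e', 'warn "boom"; exit 1']
}
statuses = stream_tagged(jobs)
statuses.each { |status| puts "exit status: #{status.exitstatus}" }
```

Because the children write to a pipe in this variant, their STDOUT is buffered unless they flush or set sync, so lines may arrive in bursts otherwise.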

Change STDIN with a pipe and it's a directory

I have this
pipe_in, pipe_out = IO.pipe
fork do
  # child 1
  pipe_in.close
  STDOUT.reopen pipe_out
  STDERR.reopen pipe_out
  puts "Hello World"
  pipe_out.close
end
fork do
  # child 2
  pipe_out.close
  STDIN.reopen pipe_in
  while line = gets
    puts 'child2:' + line
  end
  pipe_in.close
end
Process.wait
Process.wait
gets will always raise an error saying "gets: Is a directory", which doesn't make sense to me. If I change gets to pipe_in.gets it works. What I want to know is, why don't STDIN.reopen pipe_in and gets work?
It works for me, with the following change:
pipe_in.close
end
+pipe_in.close
+pipe_out.close
+
Process.wait
Process.wait
Without this change, you still have the pipes open in the original process, so the reader will never see an end of file. That is, the process doing the wait still has the write end of the pipe open, leading to a deadlock.
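Put together, a complete version might look like the sketch below. It makes one change beyond the fix described here, which is an assumption about the cause of the error rather than something confirmed in the question: Kernel#gets reads from the files named in ARGV when there are any and only falls back to standard input otherwise, so the sketch calls $stdin.gets to read the reopened pipe regardless of how the script was invoked.

```ruby
pipe_in, pipe_out = IO.pipe

writer = fork do
  # child 1: send everything written to STDOUT into the pipe
  pipe_in.close
  STDOUT.reopen pipe_out
  pipe_out.close # STDOUT now holds its own copy of the write end
  puts 'Hello World'
end

reader = fork do
  # child 2: read the pipe through STDIN
  pipe_out.close
  STDIN.reopen pipe_in
  pipe_in.close
  # $stdin.gets always reads the reopened pipe, whereas a bare gets would
  # first try to open any command-line arguments as files
  while (line = $stdin.gets)
    puts 'child2:' + line
  end
end

# The parent must close both ends, otherwise the reader never sees EOF
pipe_in.close
pipe_out.close

statuses = [writer, reader].map { |pid| Process.wait2(pid).last }
puts "children succeeded: #{statuses.all?(&:success?)}"
```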

How to proxy a shell process in ruby

I'm creating a script to wrap jdb (java debugger). I essentially want to wrap this process and proxy the user interaction. So I want it to:
start jdb from my script
send the output of jdb to stdout
pause and wait for input when jdb does
when the user enters commands, pass it to jdb
At the moment I really want a pass thru to jdb. The reason for this is to initialize the process with specific parameters and potentially add more commands in the future.
Update:
Here's the shell of what ended up working for me using expect:
PTY.spawn("jdb -attach 1234") do |read,write,pid|
  write.sync = true
  while (true) do
    read.expect(/\r\r\n> /) do |s|
      s = s[0].split(/\r\r\n/)
      s.pop # get rid of prompt
      s.each { |line| puts line }
      print '> '
      STDOUT.flush
      write.print(STDIN.gets)
    end
  end
end
Use Open3.popen3(). e.g.:
Open3.popen3("jdb args") { |stdin, stdout, stderr|
  # stdin = jdb's input stream
  # stdout = jdb's output stream
  # stderr = jdb's stderr stream
  threads = []
  threads << Thread.new(stderr) do |terr|
    while (line = terr.gets)
      puts "stderr: #{line}"
    end
  end
  threads << Thread.new(stdout) do |tout|
    while (line = tout.gets)
      puts "stdout: #{line}"
    end
  end
  stdin.puts "blah"
  threads.each{|t| t.join()} #in order to cleanup when you're done.
}
I've given you examples for threads, but you of course want to be responsive to what jdb is doing. The above is merely a skeleton for how you open the process and handle communication with it.
The Ruby standard library includes expect, which is designed for just this type of problem. See the documentation for more information.
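As a rough illustration of the prompt-driven loop, IO#expect can also be used on an ordinary pipe. The sketch below rests on several assumptions: FAKE_DEBUGGER is a made-up stand-in for jdb that prints a "> " prompt and echoes each command, drive is a hypothetical helper, and a real jdb session may need a PTY as in the update above.

```ruby
require 'open3'
require 'expect'
require 'rbconfig'

# Made-up stand-in for an interactive program (not jdb): it prints a "> "
# prompt, reads a command, reports it and prompts again.
FAKE_DEBUGGER = <<~'RUBY'
  $stdout.sync = true
  print '> '
  while (line = $stdin.gets)
    puts "ran: #{line}"
    print '> '
  end
RUBY

# Hypothetical helper: send each command once the prompt appears and
# collect the text printed before the next prompt.
def drive(commands, timeout: 5)
  replies = []
  Open3.popen2(RbConfig.ruby, '-e', FAKE_DEBUGGER) do |stdin, stdout, wait_thr|
    stdout.expect(/> \z/, timeout) or raise 'no initial prompt'
    commands.each do |command|
      stdin.puts command
      result = stdout.expect(/> \z/, timeout) or raise "no prompt after #{command}"
      replies << result[0].sub(/> \z/, '').chomp # drop the trailing prompt
    end
    stdin.close # EOF on its input lets the stand-in exit
    wait_thr.value
  end
  replies
end

drive(%w[where cont]).each { |reply| puts reply }
```

To proxy a user instead of a fixed command list, the commands would come from STDIN.gets inside the loop, as in the PTY version.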
