Continuously read from STDOUT and STDERR in real time - ruby

Say we have a small application (example.rb) which behave like the following ruby code:
#!/usr/bin/env ruby
done = false
out =Thread.new do
cnt = 0
while !done
STDOUT.puts "out #{cnt}"
cnt += 1
sleep 1
end
end
err = Thread.new do
cnt = 0
while !done
STDERR.puts "err #{cnt}"
cnt += 1
sleep 1
end
end
while true
i = STDIN.gets
if i == "q\n"
puts "Quiting"
done = true
break
end
end
out.join
err.join
exit 42
It print something to stdout and to stderr, it must be quited by writing "q\n" to stdin, and when it exit a value is returned in the return code.
Now, I would like to write a small ruby script which can run this program in an external process, where stdout and stdin are captured, and when the external process should be terminated this is done by writing "q\n" to its stdin. This program is called monitor.rb.
This is what I have tried here:
#!/usr/bin/env ruby
require 'open3'
class Monitor
#cmd
attr_accessor :return_code
def initialize cmd
#cmd = cmd
end
def run
#runner_thread = Thread.new do
Open3::popen3(#cmd) do |stdin, stdout, stderr, thread|
puts "#{Time.now} #{#cmd} is running as pid: #{thread.pid}"
stdin.sync = true;
stdout.sync = true;
stderr.sync = true;
#stdin = stdin
t_out = Thread.new do
stdout.readlines do |l|
puts "#{Time.now} STDOUT> #{l}"
end
end
t_err = Thread.new do
stderr.readlines do |l|
puts "#{Time.now} STDERR> #{l}"
end
end
thread.join
t_err.join
t_out.join
#return_code = thread.value
end
end
end
def quit
puts "Quiting"
#stdin.puts "q"
#stdin.close
#runner_thread.join
end
end
mon = Monitor.new "./example.rb"
mon.run
sleep 5
mon.quit
puts "Return code: #{mon.return_code}"
Question 1: What is wrong with my code since the output of the external process is not being printed?
Question 2: Can this be done in a more elegant way, and what would that look like?
The code must be able to run on Linux and portability is not a priority, I uses ruby 2.0.
When run example.rb in a terminal I get:
$ ./example.rb
out 0
err 0
out 1
err 1
out 2
err 2
q
Quiting
When I run the monitor application I get:
$ ./monitor.rb
2013-11-19 14:39:20 +0100 ./example.rb is running as pid: 7228
Quiting
Return code: pid 7228 exit 42
I expected the monitor.rb to print the output from example.rb

Try changing your t_out and t_err threads to use the following code. readlines will read the entire file at once and stdout and stderr will block until your script exits. I think this is why you were not getting any output.
while l = stdout.gets
puts "#{Time.now} STDOUT> #{l}"
end
This should print out to the screen as soon as any output is available.

Related

Double fork and stdin

I wrote this code to run my process in a daemon. The goal is to make this process running even if I close its parent. Now, i would like to be able to write something in its stdin. What should I do ? Here's the code.
def daemonize(cmd, options = {})
rd, wr = IO.pipe
p1 = Process.fork {
Process.setsid
p2 = Process.fork {
$0 = cmd #Name of the command
pidfile = File.new(options[:pid_file], 'w')
pidfile.chmod( 0644 )
pidfile.puts "#{Process.pid}"
pidfile.close
Dir.chdir(ENV["PWD"] = options[:working_dir].to_s) if options[:working_dir]
File.umask 0000
STDIN.reopen '/dev/null'
STDOUT.reopen '/dev/null', 'a'
STDERR.reopen STDOUT
Signal.trap("USR1") do
Console.show 'I just received a USR1', 'warning'
end
::Kernel.exec(*Shellwords.shellwords(cmd)) #Executing the command in the parent process
exit
}
raise 'Fork failed!' if p2 == -1
Process.detach(p2) # divorce p2 from parent process (p1)
rd.close
wr.write p2
wr.close
exit
}
raise 'Fork failed!' if p1 == -1
Process.detach(p1) # divorce p1 from parent process (shell)
wr.close
daemon_id = rd.read.to_i
rd.close
daemon_id
end
Is there a way to reopen stdin in something like a pipe instead of /dev/null in which I would be able to write ?
How about a fifo? In linux, you can use the mkfifo command:
$ mkfifo /tmp/mypipe
Then you can reopen STDIN on that pipe:
STDIN.reopen '/tmp/mypipe'
# Do read-y things
Anything else can write to that pipe:
$ echo "roflcopter" > /tmp/mypipe
allowing that data to be read by the ruby process.
(Update) Caveat with blocking
Since fifos block until there's a read and write (e.g. a read is blocked unless there's a write, and vice-versa), it's best handled with multiple threads. One thread should do the reading, passing the data to a queue, and another should handle that input. Here's an example of that situation:
require 'thread'
input = Queue.new
threads = []
# Read from the fifo and add to an input queue (glorified array)
threads << Thread.new(input) do |ip|
STDIN.reopen 'mypipe'
loop do
if line = STDIN.gets
puts "Read: #{line}"
ip.push line
end
end
end
# Handle the input passed by the reader thread
threads << Thread.new(input) do |ip|
loop do
puts "Ouput: #{ip.pop}"
end
end
threads.map(&:join)

Run code block with Daemons gem

can somebody explain why the following code won't spawn the passed block ?
require 'daemons'
t = Daemons.call do
# This block does not start
File.open('out.log','w') do # code don't get here to open a file
|fw|
10.times {
fw.puts "=>#{rand(100)}"
sleep 1
}
end
end
#t.start # has no effect
10.times {
puts "Running ? #{t.running?}" # prints "Running ? false" 10 times
sleep 1
}
t.stop
puts 'finished'
Ruby 1.9.3p392, x86_64 Linux
You sure you're not trying to run a Thread for concurrent programming?
Here's what a Thread implementation would look like:
f = File.open('out.log', 'w')
t = Thread.new do
10.times {
f.puts "=>#{rand(100)}"
sleep 1
}
end
10.times {
puts "Running ? #{t.alive?}"
sleep 1
}
t.exit
puts 'finished'

How can I capture STDOUT to a string?

puts "hi"
puts "bye"
I want to store the STDOUT of the code so far (in this case hi \nbye into a variable say 'result' and print it )
puts result
The reason I am doing this is I have integrate an R code into my Ruby code, output of which is given to the STDOUT as the R code runs , but the ouput cannot be accessed inside the code to do some evaluations. Sorry if this is confusing. So the "puts result" line should give me hi and bye.
A handy function for capturing stdout into a string...
The following method is a handy general purpose tool to capture stdout and return it as a string. (I use this frequently in unit tests where I want to verify something printed to stdout.) Note especially the use of the ensure clause to restore $stdout (and avoid astonishment):
def with_captured_stdout
original_stdout = $stdout # capture previous value of $stdout
$stdout = StringIO.new # assign a string buffer to $stdout
yield # perform the body of the user code
$stdout.string # return the contents of the string buffer
ensure
$stdout = original_stdout # restore $stdout to its previous value
end
So, for example:
>> str = with_captured_stdout { puts "hi"; puts "bye"}
=> "hi\nbye\n"
>> print str
hi
bye
=> nil
Redirect Standard Output to a StringIO Object
You can certainly redirect standard output to a variable. For example:
# Set up standard output as a StringIO object.
foo = StringIO.new
$stdout = foo
# Send some text to $stdout.
puts 'hi'
puts 'bye'
# Access the data written to standard output.
$stdout.string
# => "hi\nbye\n"
# Send your captured output to the original output stream.
STDOUT.puts $stdout.string
In practice, this is probably not a great idea, but at least now you know it's possible.
You can do this by making a call to your R script inside backticks, like this:
result = `./run-your-script`
puts result # will contain STDOUT from run-your-script
For more information on running subprocesses in Ruby, check out this Stack Overflow question.
If activesupport is available in your project you may do the following:
output = capture(:stdout) do
run_arbitrary_code
end
More info about Kernel.capture can be found here
For most practical purposes you can put anything into $stdout that responds to write, flush, sync, sync= and tty?.
In this example I use a modified Queue from the stdlib.
class Captor < Queue
alias_method :write, :push
def method_missing(meth, *args)
false
end
def respond_to_missing?(*args)
true
end
end
stream = Captor.new
orig_stdout = $stdout
$stdout = stream
puts_thread = Thread.new do
loop do
puts Time.now
sleep 0.5
end
end
5.times do
STDOUT.print ">> #{stream.shift}"
end
puts_thread.kill
$stdout = orig_stdout
You need something like this if you want to actively act on the data and not just look at it after the task has finished. Using StringIO or a file will have be problematic with multiple threads trying to sync reads and writes simultaneously.
Capture stdout (or stderr) for both Ruby code and subprocesses
# capture_stream(stream) { block } -> String
#
# Captures output on +stream+ for both Ruby code and subprocesses
#
# === Example
#
# capture_stream($stdout) { puts 1; system("echo 2") }
#
# produces
#
# "1\n2\n"
#
def capture_stream(stream)
raise ArgumentError, 'missing block' unless block_given?
orig_stream = stream.dup
IO.pipe do |r, w|
# system call dup2() replaces the file descriptor
stream.reopen(w)
# there must be only one write end of the pipe;
# otherwise the read end does not get an EOF
# by the final `reopen`
w.close
t = Thread.new { r.read }
begin
yield
ensure
stream.reopen orig_stream # restore file descriptor
end
t.value # join and get the result of the thread
end
end
I got inspiration from Zhon.
Minitest versions:
assert_output if you need to ensure if some output is generated:
assert_output "Registrars processed: 1\n" do
puts 'Registrars processed: 1'
end
assert_output
or use capture_io if you really need to capture it:
out, err = capture_io do
puts "Some info"
warn "You did a bad thing"
end
assert_match %r%info%, out
assert_match %r%bad%, err
capture_io
Minitest itself is available in any Ruby version starting from 1.9.3
For RinRuby, please know that R has capture.output:
R.eval <<EOF
captured <- capture.output( ... )
EOF
puts R.captured
Credit to #girasquid's answer. I modified it to a single file version:
def capture_output(string)
`echo #{string.inspect}`.chomp
end
# example usage
response_body = "https:\\x2F\\x2Faccounts.google.com\\x2Faccounts"
puts response_body #=> https:\x2F\x2Faccounts.google.com\x2Faccounts
capture_output(response_body) #=> https://accounts.google.com/accounts

How do I see the output of multiple forked processes simultaneously

I am using the "fork" option in ruby as follows:
pid1 = fork do
pid1_output = `ruby scrape1.rb`
puts "#{pid1_output}"
puts ""
exit
end
pid2 = fork do
pid2_output = `ruby scrape2.rb`
puts "#{pid2_output}"
puts ""
exit
end
pid3 = fork do
pid3_output = `ruby scrape3.rb`
puts "#{pid3_output}"
puts ""
exit
end
pid4 = fork do
pid4_output = `ruby scrape4.rb`
puts "#{pid4_output}"
puts ""
exit
end
Process.waitall
The problem here is that sometimes one of the processes (eg: ruby scrape1.rb) might fail or end up returning ginormous amounts of text that cannot be captured in a variable... How do I still simultaneously run 4 processes and see all their outputs in one terminal window in realtime? I understand the order of output might be mushed up but that is alright.. I basically want to re-route the STDOUT and STDERR of each forked process to the main program.. That way I can see what is being scraped by each of my scrapers and follow their progress and errors as they happen.
fork do
exec("ruby scrape1.rb")
end
fork do
exec("ruby scrape2.rb")
end
fork do
exec("ruby scrape3.rb")
end
fork do
exec("ruby scrape4.rb")
end
Process.waitall

Why is IO::WaitReadable being raised differently for STDOUT than STDERR?

Given that I wish to test non-blocking reads from a long command, I created the following script, saved it as long, made it executable with chmod 755, and placed it in my path (saved as ~/bin/long where ~/bin is in my path).
I am on a *nix variant with ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-darwin11.0.0] compiled with RVM defaults. I do not use Windows, and am therefore unsure if the test script will work for you if you do.
#!/usr/bin/env ruby
3.times do
STDOUT.puts 'message on stdout'
STDERR.puts 'message on stderr'
sleep 1
end
Why does long_err produce each STDERR message as it is printed by "long"
def long_err( bash_cmd = 'long', maxlen = 4096)
stdin, stdout, stderr = Open3.popen3(bash_cmd)
begin
begin
puts 'err -> ' + stderr.read_nonblock(maxlen)
end while true
rescue IO::WaitReadable
IO.select([stderr])
retry
rescue EOFError
puts 'EOF'
end
end
while long_out remains blocked until all STDOUT messages are printed?
def long_out( bash_cmd = 'long', maxlen = 4096)
stdin, stdout, stderr = Open3.popen3(bash_cmd)
begin
begin
puts 'out -> ' + stdout.read_nonblock(maxlen)
end while true
rescue IO::WaitReadable
IO.select([stdout])
retry
rescue EOFError
puts 'EOF'
end
end
I assume you will require 'open3' before testing either function.
Why is IO::WaitReadable being raised differently for STDOUT than STDERR?
Workarounds using other ways to start subprocesses also appreciated if you have them.
In most OS's STDOUT is buffered while STDERR is not. What popen3 does is basically open a pipe between the exeutable you launch and Ruby.
Any output that is in buffered mode is not sent through this pipe until either:
The buffer is filled (thereby forcing a flush).
The sending application exits (EOF is reached, forcing a flush).
The stream is explicitly flushed.
The reason STDERR is not buffered is that it's usually considered important for error messages to appear instantly, rather than go for for efficiency through buffering.
So, knowing this, you can emulate STDERR behaviour with STDOUT like this:
#!/usr/bin/env ruby
3.times do
STDOUT.puts 'message on stdout'
STDOUT.flush
STDERR.puts 'message on stderr'
sleep 1
end
and you will see the difference.
You might also want to check "Understanding Ruby and OS I/O buffering".
Here's the best I've got so far for starting subprocesses. I launch a lot of network commands so I needed a way to time them out if they take too long to come back. This should be handy in any situation where you want to remain in control of your execution path.
I adapted this from a Gist, adding code to test the exit status of the command for 3 outcomes:
Successful completion (exit status 0)
Error completion (exit status is non-zero) - raises an exception
Command timed out and was killed - raises an exception
Also fixed a race condition, simplified parameters, added a few more comments, and added debug code to help me understand what was happening with exits and signals.
Call the function like this:
output = run_with_timeout("command that might time out", 15)
output will contain the combined STDOUT and STDERR of the command if it completes successfully. If the command doesn't complete within 15 seconds it will be killed and an exception raised.
Here's the function (2 constants you'll need defined at the top):
DEBUG = false # change to true for some debugging info
BUFFER_SIZE = 4096 # in bytes, this should be fine for many applications
def run_with_timeout(command, timeout)
output = ''
tick = 1
begin
# Start task in another thread, which spawns a process
stdin, stderrout, thread = Open3.popen2e(command)
# Get the pid of the spawned process
pid = thread[:pid]
start = Time.now
while (Time.now - start) < timeout and thread.alive?
# Wait up to `tick' seconds for output/error data
Kernel.select([stderrout], nil, nil, tick)
# Try to read the data
begin
output << stderrout.read_nonblock(BUFFER_SIZE)
puts "we read some data..." if DEBUG
rescue IO::WaitReadable
# No data was ready to be read during the `tick' which is fine
print "." # give feedback each tick that we're waiting
rescue EOFError
# Command has completed, not really an error...
puts "got EOF." if DEBUG
# Wait briefly for the thread to exit...
# We don't want to kill the process if it's about to exit on its
# own. We decide success or failure based on whether the process
# completes successfully.
sleep 1
break
end
end
if thread.alive?
# The timeout has been reached and the process is still running so
# we need to kill the process, because killing the thread leaves
# the process alive but detached.
Process.kill("TERM", pid)
end
ensure
stdin.close if stdin
stderrout.close if stderrout
end
status = thread.value # returns Process::Status when process ends
if DEBUG
puts "thread.alive?: #{thread.alive?}"
puts "status: #{status}"
puts "status.class: #{status.class}"
puts "status.exited?: #{status.exited?}"
puts "status.exitstatus: #{status.exitstatus}"
puts "status.signaled?: #{status.signaled?}"
puts "status.termsig: #{status.termsig}"
puts "status.stopsig: #{status.stopsig}"
puts "status.stopped?: #{status.stopped?}"
puts "status.success?: #{status.success?}"
end
# See how process ended: .success? => true, false or nil if exited? !true
if status.success? == true # process exited (0)
return output
elsif status.success? == false # process exited (non-zero)
raise "command `#{command}' returned non-zero exit status (#{status.exitstatus}), see below output\n#{output}"
elsif status.signaled? # we killed the process (timeout reached)
raise "shell command `#{command}' timed out and was killed (timeout = #{timeout}s): #{status}"
else
raise "process didn't exit and wasn't signaled. We shouldn't get to here."
end
end
Hope this is useful.

Resources