Ruby readpartial on IO.pipe won't raise EOFError

I am trying to understand how IO.pipe works in Ruby. In the example below, I send data to the pipe and close the write end after I finish sending the data. In a child process, I read the data using the readpartial method. According to the docs, readpartial should raise EOFError when the stream is closed, but I end up with a blocked process instead. Am I understanding this right?
# ruby 2.6.3
r, w = IO.pipe

pid = fork do
  loop do
    puts r.readpartial(1024)
    sleep 0.1
  rescue EOFError
    puts "End of File"
    break
  end
end

1000.times do |t|
  w.write "xyz #{t}"
end
w.close
Process.wait(pid)

You need to close the write end of the pipe in the child as well as in the parent.
After you fork, you effectively have two write ends and two read ends to the pipe. When you close the write end in the parent, the write end in the child is still open, so the call to readpartial will block.
You just need to call w.close immediately after forking in the child:
pid = fork do
  w.close
  loop do
    # ...
You would normally also immediately close the read end in the parent after forking (although in this case it doesn’t really matter):
pid = fork do
  # ...
end
r.close
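Putting both changes together, a complete version of the example might look like this (my own assembly of the fixes above; the rescue clause directly inside the do/end block assumes Ruby 2.6+, as in the question):
r, w = IO.pipe

pid = fork do
  w.close                 # close the child's copy of the write end
  loop do
    puts r.readpartial(1024)
    sleep 0.1
  rescue EOFError         # raised once every write end is closed and the buffer is drained
    puts "End of File"
    break
  end
end

r.close                   # the parent never reads, so close its read end
1000.times { |t| w.write "xyz #{t}" }
w.close                   # now the child's readpartial sees EOF
Process.wait(pid)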

Related

Trap Ctrl-D in a Ruby Script with ARGF

I am currently using ARGF.gets to capture user input from the command line. I want to allow Ctrl-D to terminate the script, but I don't know how to do this using Signal.trap or through error handling. I tried to find a list of trap codes for something like Ctrl-D, but couldn't find what I was looking for. Likewise, rescuing Exception doesn't work because Ctrl-D doesn't raise an exception. Is there a trap code for Ctrl-D, or any other way to detect it?
For example...
I am currently able to detect Ctrl-C by trapping...
# Trap ^C
Signal.trap("INT") {
  # Do something
  exit
}
or error handling...
def get_input
  input = ARGF.gets
  input.strip!
rescue SystemExit, Interrupt => e
  # If we get here, Ctrl-C was encountered
end
However, I haven't been able to trap or detect Ctrl-D.
ARGF is just a special kind of stream, and Ctrl+D simply signals the end of input.
With this in mind, use the ARGF.eof? method (see its documentation).
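For instance, a minimal sketch of that idea (my own illustration, not from the answer; checking whether ARGF.gets returned nil works equivalently, since that is what Ctrl-D on an empty line produces):
# Keep reading lines until the user signals end of input with Ctrl-D
until ARGF.eof?
  input = ARGF.gets.to_s.strip
  puts "You typed: #{input}"
end
puts "Got Ctrl-D (end of input), exiting"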
I am unsure of your use case, but I am assuming you intend to do something before the script exits. If so, your best bet is probably a bit easier than signal trapping. The Kernel module offers an #at_exit method whose block will be executed just prior to the program actually exiting.
Usage (from the Kernel#at_exit docs):
def do_at_exit(str1)
  at_exit { print str1 }
end

at_exit { puts "cruel world" }
do_at_exit("goodbye ")
exit
produces:
goodbye cruel world
As you can see, you can define multiple handlers, which will be executed in reverse order when the program exits.
Since Kernel is included in Object, you can use this in Object-specific contexts as well, like
class People
  at_exit { puts "The #{self.name} have left" }
end
exit
# The People have left
or even on instances
p = People.new
p.send(:at_exit, &->{puts "We are leaving"})
# We are leaving
# The People have left
Additionally, for more specific object-based implementations, you can take a look at ObjectSpace.define_finalizer.
Example of usage:
class Person
  def self.finalize(name)
    proc { puts "Goodbye Cruel World -#{name}" }
  end

  def initialize(name)
    @name = name
    ObjectSpace.define_finalizer(self, self.class.finalize(@name))
  end
end
Usage:
p = Person.new("engineersmnky")
exit
# Goodbye Cruel World -engineersmnky
This may not be exactly what you want, as the finalizer also fires when an object is garbage collected (not great for ephemeral objects), but if you have objects that should exist throughout the entire application, it can still be used much like at_exit. Example:
# requiring WeakRef to allow garbage collection
# See: https://ruby-doc.org/stdlib-2.3.3/libdoc/weakref/rdoc/WeakRef.html
require 'weakref'
p1 = Person.new("Engineer")
p2 = Person.new("Engineer's Monkey")
p2 = WeakRef.new(p2)
GC.start # just for this example
# Goodbye Cruel World -Engineer's Monkey
#=> nil
p2
#=> WeakRef::RefError: Invalid Reference - probably recycled
exit
# Goodbye Cruel World -Engineer
As you can see, the finalizer defined for p2 fired because that Person was garbage collected even though the program had not exited yet. p1's finalizer waited until exit to fire because its reference was retained throughout the application.

ruby get fork by pid

I have a script that runs several child processes using fork:
def my_fork(s)
  puts "start fork #{s}, pid #{Process.pid}"
  sleep s
  puts "finish"
end

forks = []
5.times do |t|
  forks << fork do
    my_fork t + 5
  end
end

begin
  Process.waitall
rescue Interrupt => e
  puts "interrupted!"
  forks.each { |fr| Process.kill 9, fr }
end
I need the ability to stop the script by pressing Ctrl+C, but by the time it is pressed, some of the processes may already be dead. How can that be verified?
If I do this:
forks.each{|fr| puts fr.exited?; Process.kill 9, fr}
I get an error:
undefined method `exited?' for 27520:Fixnum (NoMethodError)
The result of fork is the PID, so rather than fr.exited? you would need to get the status of the process with PID fr. Unfortunately, Ruby does not have a good way to get a process's status from its PID. See Get process status by pid in Ruby.
You can simply rescue the exception if you try to kill the process and it has already exited.
instead of:
forks.each{|fr| Process.kill 9, fr}
it would be:
forks.each do |fr|
  begin
    Process.kill 9, fr
  rescue Errno::ESRCH
    puts "process #{fr} already exited"
  end
end
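If you also want to know whether a given child is still running before killing it, one option is a non-blocking waitpid. A rough sketch of that idea (my own addition, not from the answer; note that a child already reaped elsewhere, for example by the interrupted Process.waitall, raises Errno::ECHILD instead):
forks.each do |fr|
  begin
    if Process.waitpid(fr, Process::WNOHANG).nil?
      # nil means the child is still running
      Process.kill 9, fr
      Process.waitpid(fr)   # reap the child we just killed
    else
      puts "process #{fr} had already exited"
    end
  rescue Errno::ECHILD
    puts "process #{fr} was already reaped"
  end
end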

How to make a ruby thread execute a function of my choosing?

Is it possible to create a "worker thread" so to speak that is on standby until it receives a function to execute asynchronously?
Is there a way to send a function like
def some_function
  puts "hi"
  # write something
  db.exec()
end
to an existing thread that's just sitting there waiting?
The idea is that I'd like to hand off some database writes to a thread which runs asynchronously.
I thought about creating a Queue instance and then having a thread do something like this:
$command = Queue.new

Thread.new do
  while trigger = $command.pop
    some_method
  end
end

$command.push("go!")
However this does not seem like a particularly good way to go about it. What is a better alternative?
The thread gem looks like it would suit your needs:
require 'thread/channel'

def some_method
  puts "hi"
end

channel = Thread.channel

Thread.new do
  while data = channel.receive
    some_method
  end
end

channel.send("go!")
channel.send("ruby!")  # Any truthy message will do
channel.send(nil)      # Non-truthy message to terminate other thread
sleep(1)               # Give other thread time to do I/O
The channel uses ConditionVariable, which you could use yourself if you prefer.
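If you'd prefer to stay within the standard library, the same pattern works with the Queue from the question by pushing callable objects (lambdas) rather than bare trigger values. A minimal sketch of that variation (my own, not from the answer):
# Queue is built into modern Rubies (require 'thread' on very old versions)
jobs = Queue.new

worker = Thread.new do
  # pop blocks until something is pushed; a nil job ends the loop
  while (job = jobs.pop)
    job.call
  end
end

jobs.push -> { puts "hi" }
jobs.push -> { puts "another asynchronous write" }
jobs.push nil   # tell the worker to shut down
worker.join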

Timeout within a popen works, but popen inside a timeout doesn't?

It's easiest to explain in code:
require 'timeout'

puts "this block will properly kill the sleep after a second"
output = ""
IO.popen("sleep 60") do |io|
  begin
    Timeout.timeout(1) do
      while (line = io.gets)
        output += line
      end
    end
  rescue Timeout::Error => ex
    Process.kill 9, io.pid
    puts "timed out: this block worked correctly"
  end
end
puts "but this one blocks for >1 minute"
begin
pid = 0
Timeout.timeout(1) do
IO.popen("sleep 60") do |io|
pid = io.pid
while (line=io.gets) do
output += line
end
end
end
rescue Timeout::Error => ex
puts "timed out: the exception gets thrown, but much too late"
end
My mental model of the two blocks is that they behave identically.
So, what am I missing?
Edit: drmaciver suggested on Twitter that in the first case, for some reason, the pipe socket goes into non-blocking mode, but in the second it doesn't. I can't think of any reason why this would happen, nor can I figure out how to get the descriptor's flags, but it's at least a plausible answer. Working on that possibility.
Aha, subtle.
There is a hidden, blocking ensure clause at the end of the IO#popen block in the second case. The Timeout::Error is raised in a timely fashion, but you cannot rescue it until execution returns from that implicit ensure clause.
Under the hood, IO.popen(cmd) { |io| ... } does something like this:
def my_illustrative_io_popen(cmd, &block)
  begin
    pio = IO.popen(cmd)
    block.call(pio)  # This *is* interrupted...
  ensure
    pio.close        # ...but then control goes here, which blocks on cmd's termination
  end
end
and the IO#close call is more or less a pclose(3), which blocks in waitpid(2) until the sleeping child exits.
You can verify this like so:
#!/usr/bin/env ruby

require 'timeout'

BEGIN { $BASETIME = Time.now.to_i }

def xputs(msg)
  puts "%4.2f: %s" % [(Time.now.to_f - $BASETIME), msg]
end

begin
  Timeout.timeout(3) do
    begin
      xputs "popen(sleep 10)"
      pio = IO.popen("sleep 10")
      sleep 100 # or loop over pio.gets or whatever
    ensure
      xputs "Entering ensure block"
      # Process.kill 9, pio.pid # <--- This would solve your problem!
      pio.close
      xputs "Leaving ensure block"
    end
  end
rescue Timeout::Error => ex
  xputs "rescuing: #{ex}"
end
So, what can you do?
You'll have to do it the explicit way, since the interpreter doesn't expose a way to override the IO#popen ensure logic. You can use the above code as a starting template and uncomment the kill() line, for example.
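For example, a condensed version of that explicit approach might look like this (a sketch of my own along the lines of the first block in the question, not a drop-in recipe):
require 'timeout'

output = ""
io = IO.popen("sleep 60")
begin
  Timeout.timeout(1) do
    while (line = io.gets)
      output += line
    end
  end
rescue Timeout::Error
  Process.kill 9, io.pid  # kill the child so the close below doesn't block in waitpid
ensure
  io.close                # reaps the (now dead) child promptly
end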
In the first block, the timeout is raised in the child, killing it and returning control to the parent. In the second block, the timeout is raised in the parent. The child never gets the signal.
See io.c https://github.com/ruby/ruby/blob/trunk/io.c#L6021
and timeout.rb https://github.com/ruby/ruby/blob/trunk/lib/timeout.rb#L51

Testing a REPL in Ruby with RSpec and threads

I'm using RSpec to test the behavior of a simple REPL. The REPL just echoes back whatever the input was, unless the input was "exit", in which case it terminates the loop.
To avoid hanging the test runner, I'm running the REPL method inside a separate thread. To make sure that the code in the thread has executed before I write expectations about it, I've found it necessary to include a brief sleep call. If I remove it, the tests fail intermittently because the expectations are sometimes made before the code in the thread has run.
What is a good way to structure the code and spec such that I can make expectations about the REPL's behavior deterministically, without the need for the sleep hack?
Here is the REPL class and the spec:
class REPL
  def initialize(stdin = $stdin, stdout = $stdout)
    @stdin = stdin
    @stdout = stdout
  end

  def run
    @stdout.puts "Type exit to end the session."
    loop do
      @stdout.print "$ "
      input = @stdin.gets.to_s.chomp.strip
      break if input == "exit"
      @stdout.puts(input)
    end
  end
end
describe REPL do
  let(:stdin) { StringIO.new }
  let(:stdout) { StringIO.new }
  let!(:thread) { Thread.new { subject.run } }
  subject { described_class.new(stdin, stdout) }

  # Removing this before hook causes the examples to fail intermittently
  before { sleep 0.01 }

  after { thread.kill if thread.alive? }

  it "prints a message on how to end the session" do
    expect(stdout.string).to match(/end the session/)
  end

  it "prints a prompt for user input" do
    expect(stdout.string).to match(/\$ /)
  end

  it "echoes input" do
    stdin.puts("foo")
    stdin.rewind
    expect(stdout.string).to match(/foo/)
  end
end
Instead of letting :stdout be a StringIO, you could back it by a Queue. Then when you try to read from the queue, your tests will just wait until the REPL pushes something into the queue (aka. writes to stdout).
require 'thread'

class QueueIO
  def initialize
    @queue = Queue.new
  end

  def write(str)
    @queue.push(str)
  end

  def puts(str)
    write(str + "\n")
  end

  def read
    @queue.pop
  end
end
let(:stdout) { QueueIO.new }
I just wrote this up without trying it out, and it may not be robust enough for your needs, but it gets the point across. If you use a data structure to synchronize the two threads like this, then you don't need to sleep at all. Since this removes the non-determinism, you shouldn't see the intermittent failures.
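For illustration, the spec might then read from the queue instead of inspecting a string. This sketch is my own addition; it also adds a print method, which the REPL above calls for its prompt but which QueueIO as written does not define:
class QueueIO
  alias_method :print, :write  # the REPL also calls print for the "$ " prompt
end

describe REPL do
  let(:stdin)  { StringIO.new }
  let(:stdout) { QueueIO.new }
  let!(:thread) { Thread.new { subject.run } }
  subject { described_class.new(stdin, stdout) }

  after { thread.kill if thread.alive? }

  it "prints a message on how to end the session" do
    # read blocks until the REPL has written something, so no sleep is needed
    expect(stdout.read).to match(/end the session/)
  end
end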
I've used a running? guard for situations like this. You probably can't avoid the sleep entirely, but you can avoid unnecessary sleeps.
First, add a running? method to your REPL class.
class REPL
  ...
  def running?
    !!@running
  end

  def run
    @running = true
    loop do
      ...
      if input == 'exit'
        @running = false
        break
      end
      ...
    end
  end
end
Then, in your specs, sleep until the REPL is running:
describe REPL do
  ...
  before { sleep 0.01 until subject.running? }
  ...
end
