Does ruby IO.gets read from a buffer? - ruby

Can someone explain to me how pipe.gets works in this instance? Is the IO object (pipe in this example) buffering the output? How is it that I can take my time reading the output from stdout using "gets"? I even put a sleep in before the gets call to make sure I'm not crazy.
def run_script(commands)
  raw_output = nil
  IO.popen("./db", "r+") do |pipe|
    commands.each do |command|
      pipe.puts command
    end
    pipe.close_write
    # Read entire output
    raw_output = pipe.gets(nil)
  end
  raw_output.split("\n")
end

The pipe.sync option defaults to false for popen, meaning that the commands are buffered by Ruby until you call pipe.close_write or pipe.close, exit the program, or the buffer is full. The buffer is then flushed, delivering all buffered data to the other program.
For more info check out the IO#sync documentation and/or What does “file.sync = true” do?
Based on your comment I'm not sure I understand the question. The only reason I can think of why pipe.gets(nil) would return nil is if the other program has no output.
The other option is that pipe.gets(nil) will block. This could happen if pipe is never flushed, meaning the other program is still waiting for its input. Because the other program is blocked and hasn't closed its standard output, pipe.gets(nil) will also block. pipe.gets(nil) reads everything from the other program's standard output. How does it know everything has been read? A closed connection means no more data can be sent, so it waits until the connection is closed.
Ruby stdout -------> stdin ./db
stdin <-------- stdout
(blocked because empty)
pipe.puts("a") "a\n" -------> empty <read from stdin>
empty <-------- empty
(still blocked)
pipe.puts("b") "a\nb\n" -----> empty <read from stdin>
empty <-------- empty
(flushing stdout) (continues consuming "a\nb\n")
pipe.close_write empty ----/---> "a\nb\n" <read from stdin>
empty <-------- empty
(blocks because empty)
pipe.gets(nil) empty ----/---> empty <send data to stdout>
empty <-------- "data\n"
(continues consuming "data\n") (flushing stdout)
pipe.gets(nil) empty ----/---> empty <close stdout>
"data\n" <--/-- empty
I hope the above helps demonstrate the process of your current code.
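The part about taking your time before calling gets can be tried with a small sketch. It assumes a Unix-like system and uses cat as a hypothetical stand-in for ./db, since cat simply echoes its input. Output the child has already written waits in the pipe until it is read, so the sleep does not lose anything.

```ruby
# Sketch only: `cat` stands in for ./db (an assumption for illustration).
def run_script(commands, program = "cat")
  raw_output = nil
  IO.popen(program, "r+") do |pipe|
    commands.each { |command| pipe.puts command }
    pipe.close_write              # flush pending writes and signal EOF to the child
    sleep 0.2                     # the child's output waits in the pipe meanwhile
    raw_output = pipe.gets(nil)   # read until the child closes its stdout
  end
  raw_output.to_s.split("\n")
end

lines = run_script(["insert 1", "select"])
```

Here lines holds the two commands echoed back, one per element.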

Related

Why is Ruby STDOUT buffering when I don't expect it to?

The following code is a simplification of my current situation. I have a JSON log source which I continuously fetch and write to stdout with puts.
#!/usr/bin/env ruby
require "json"
loop do
  puts({ value: "foobar" }.to_json)
  sleep 1
end
I want to be able to pipe the output of this script into jq for further processing, but in a 'stream'-friendly way, using unix pipes. Running the above code like so:
./my_script | jq
Results in an empty output. However, if I place an exit statement after the sleep call, the output is sent through the pipe to jq as expected. I was able to solve this problem by calling $stdout.flush following the puts call. While it's working now, I'm not sure why. $stdout.sync is set to true by default (see IO#sync). It seems to me that if sync was enabled, then Ruby should be doing no output buffering, and calling $stdout.flush should not be required - yet it is.
My follow-up question is about using tail instead of jq. It seems to me that I should be able to pipe a text stream into tail the same way I pipe it into jq, but neither method (with the $stdout.flush call or without it) works - the output is just empty.
As @Ry points out in the comments, $stdout.sync is true by default in IRB, but this is not necessarily the same for scripts.
So you should set $stdout.sync = true to be sure to prevent buffering.
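A rough way to check this is to run a child script whose stdout is a pipe and time how long the first line takes to arrive. The sketch below assumes a Unix-like system; the child script and the "tick" line are made up for illustration.

```ruby
require "rbconfig"

# Hypothetical child: prints one line, then sleeps like the loop in the question.
child_script = <<~'RUBY'
  $stdout.sync = true   # write each puts straight through to the pipe
  puts "tick"
  sleep 10
RUBY

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
first_line = nil
elapsed = nil
IO.popen([RbConfig.ruby, "-e", child_script]) do |pipe|
  first_line = pipe.gets   # returns as soon as the child's line reaches the pipe
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  Process.kill("TERM", pipe.pid)   # stop the child instead of waiting out its sleep
end
```

With sync enabled the line should arrive well before the child's sleep ends. Without it, the line would stay in the child's buffer until the child exits or the buffer fills.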

Calling a Perl script from Ruby

I am currently trying to figure out a way to call a Perl script from Ruby and have it produce output as if I were in the terminal, while allowing me to provide input if it is needed.
I have figured out how I can do this and get the input after the fact but because the Perl script is still running, I am not able to run anything else.
I should note that I can not edit the Perl scripts. These scripts are being provided and this Ruby script is being made to make the process of running all of the Perl scripts easier and ensuring they are in the right order.
upgradestatus = `#{upgradearray[arraylocation]}`
This would be the relevant part of my code for this. I have attempted a few other variations of how to do this but I get the same situation every time. When the script starts running it requires input so it just sits there.
You can't do what you want using backticks, %x or a normal sub-shell, because they lack the ability to watch the sub-command's output while it is running.
You could do it using Open3's popen2 or popen3 methods. They let you write to the STDIN stream of the called program and read from its STDOUT. popen3 also lets you capture the STDERR stream. Unfortunately, you often have to send your input and then close the STDIN stream before the called program will return its output, which might be the case with the Perl scripts.
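A minimal sketch of the popen3 approach follows. It uses a one-line Ruby child as a hypothetical stand-in for one of the Perl scripts, and it follows the send, close, then read sequence described above.

```ruby
require "open3"
require "rbconfig"

# Hypothetical child: reads a name, greets on stdout, reports on stderr.
child_script = 'name = $stdin.gets.to_s.chomp; puts "hello #{name}"; warn "done"'

out_text = nil
err_text = nil
exit_status = nil
Open3.popen3(RbConfig.ruby, "-e", child_script) do |stdin, stdout, stderr, wait_thr|
  stdin.puts "world"          # send the input the child is waiting for
  stdin.close                 # signal EOF so the child can finish
  out_text = stdout.read      # fine for small outputs; large ones need concurrent reads
  err_text = stderr.read
  exit_status = wait_thr.value
end
```

Reading stdout to the end before touching stderr is only safe when the output is small. A child that fills the stderr pipe first would block.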
If you need more control, look into using Ruby's built-in PTY module. It's designed to let you talk to a running app through a scripting mechanism. You have to set up code to look for prompts, then respond to them by sending back the appropriate data. It can be simple, or it can be a major PITA, depending on the code you're talking to.
This is the example for the open command:
require 'pty'
PTY.open {|m, s|
  p m #=> #<IO:masterpty:/dev/pts/1>
  p s #=> #<File:/dev/pts/1>
  p s.path #=> "/dev/pts/1"
}
# Change the buffering type in factor command,
# assuming that factor uses stdio for stdout buffering.
# If IO.pipe is used instead of PTY.open,
# this code deadlocks because factor's stdout is fully buffered.
require 'io/console' # for IO#raw!
m, s = PTY.open
s.raw! # disable newline conversion.
r, w = IO.pipe
pid = spawn("factor", :in=>r, :out=>s)
r.close
s.close
w.puts "42"
p m.gets #=> "42: 2 3 7\n"
w.puts "144"
p m.gets #=> "144: 2 2 2 2 3 3\n"
w.close
# The result of read operation when pty slave is closed is platform
# dependent.
ret = begin
  m.gets # FreeBSD returns nil.
rescue Errno::EIO # GNU/Linux raises EIO.
  nil
end
p ret #=> nil

Problem redirecting stdout in Ruby script

I have the following test Ruby script:
require 'tempfile'
tempfile = Tempfile.new 'test'
$stderr.reopen tempfile
$stdout.reopen tempfile
puts 'test stdout'
warn 'test stderr'
`mail -s 'test' my@email.com < #{tempfile.path}`
tempfile.close
tempfile.unlink
$stderr.reopen STDERR
$stdout.reopen STDOUT
The email that I get has the contents:
test stderr
Why is stderr redirecting properly but not stdout?
Edit: In response to a comment I added a $stdout.flush after the puts line and it printed correctly. So I'll restate my question: what was happening and why does the flush fix it?
The standard output is generally buffered to avoid a system call for every write. So, when you say this:
puts 'test stdout'
You're actually just stuffing that string into the buffer. Then you say this:
`mail -s 'test' my@email.com < #{tempfile.path}`
and your 'test stdout' string is still in the buffer so it isn't in tempfile when mail sends the file's content to you. Flushing $stdout forces everything in the buffer to be handed to the operating system, which makes it visible to other processes reading the file; from the fine manual:
Flushes any buffered data within ios to the underlying operating system (note that this is Ruby internal buffering only; the OS may buffer the data as well).
$stdout.print "no newline"
$stdout.flush
produces:
no newline
The standard error is often unbuffered so that error messages (which are supposed to be rare) are visible immediately.
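The effect can be observed without mail by reading the file back through a second handle, which sees what another process would see. This is a sketch, assuming the small write fits in Ruby's internal buffer (it normally does for a regular file).

```ruby
require "tempfile"

tempfile = Tempfile.new("test")
tempfile.write("test stdout\n")           # normally held in Ruby's internal buffer
before_flush = File.read(tempfile.path)   # what another process would see now
tempfile.flush                            # hand the buffered bytes to the OS
after_flush = File.read(tempfile.path)    # now the content is visible
tempfile.close
tempfile.unlink
```

before_flush is typically empty, while after_flush contains the full line.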

Why can't open4 read from stdout when the program is waiting for stdin?

I am using the open4 gem and having problems reading from the spawned process's stdout. I have a ruby program, test1.rb:
print 'hi.' # 3 characters
$stdin.read(1) # block
And another ruby program in the same directory, test2.rb:
require 'open4'
pid, stdin, stdout, stderr = Open4.popen4 'ruby test1.rb'
p stdout.read(2) # 2 characters
When I run the second program:
$ ruby test2.rb
It just sits there forever without printing anything. Why does this happen, and what can I do to stop it?
I needed to change test1.rb to this. I don't know why.
print 'hi.' # 3 characters
$stdout.flush
$stdin.read(1) # block
By default, everything that you print to stdout or to another file is written into a buffer of Ruby (or the standard C library, which is underneath Ruby). The content of the buffer is forwarded to the OS if one of the following events occurs:
The buffer gets full.
You close stdout.
You have printed a newline sequence ("\n") while stdout is connected to a terminal.
You call flush explicitly.
For other files, a flush is done on other occasions, too, like ftell.
If you put stdout in unbuffered mode ($stdout.sync = true), the buffer will not be used.
stderr is unbuffered by default.
The reason for buffering is efficiency: aggregating output data in a buffer can save many system calls (calls to the operating system). System calls are very expensive: they take many hundreds or even thousands of CPU cycles. Avoiding them with a little bit of code and some buffers in user space results in a good speedup.
A good reading on buffering: Why does printf not flush after the call unless a newline is in the format string?
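The fixed test1.rb can be exercised with the standard library's Open3.popen2 in place of the open4 gem (a substitution made here so the sketch has no gem dependency). The child is passed inline with -e rather than as a file.

```ruby
require "open3"
require "rbconfig"

# Inline equivalent of the fixed test1.rb: print, flush, then block on stdin.
child_script = 'print "hi."; $stdout.flush; $stdin.read(1)'

first_two = nil
Open3.popen2(RbConfig.ruby, "-e", child_script) do |stdin, stdout, _wait_thr|
  first_two = stdout.read(2)   # returns because the child flushed before blocking
  stdin.close                  # EOF unblocks the child's $stdin.read(1)
end
```

Without the flush in the child, stdout.read(2) would wait for data that is still sitting in the child's buffer.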
I'm not an expert on processes.
From a first look at the API documentation, the sequence for using open4 is like this:
first send text to stdin, then close stdin, and lastly read text from stdout.
So you can write test2.rb like this:
require 'open4'
pid, stdin, stdout, stderr = Open4.popen4 'ruby test1.rb'
stdin.puts "something" # This line is important
stdin.close # It might be optional, open4 might close itself.
p stdout.read(2) # 2 characters

Broken pipe (Errno::EPIPE)

I have a Broken pipe (Errno::EPIPE) error popping up and I don't understand what it is or how to fix it. The full error is:
example.rb:19:in `write': Broken pipe (Errno::EPIPE)
from example.rb:19:in `print'
from example.rb:19
line 19 of my code is:
vari.print("x=" + my_val + "&y=1&z=Add+Num\r\n")
It means that whatever connection print is outputting to is no longer connected. Presumably the program began as input to some other program:
% ruby_program | another_program
What's happened is that another_program has exited sometime before the print in question.
Note:
The 1st section applies to Ruby scripts designed to act as terminal-based command-line utilities, assuming they require no custom handling or cleanup on receiving SIGPIPE, and assuming that you want them to exhibit the behavior of standard Unix utilities such as cat, which terminate quietly with a specific exit code when receiving SIGPIPE.
The 2nd section is for scripts that require custom handling of SIGPIPE, such as explicit cleanup and (conditional) output of error messages.
Opting into the system's default handling of SIGPIPE:
To complement wallyk's helpful answer and tokland's helpful answer:
If you want your script to exhibit the system's default behavior, as most Unix utilities (e.g., cat) do, use
Signal.trap("SIGPIPE", "SYSTEM_DEFAULT")
at the beginning of your script.
Now, when your script receives the SIGPIPE signal (on Unix-like systems), the system's default behavior will:
quietly terminate your script
report exit code 141 (which is calculated as 128 (indicating termination by signal) + 13 (SIGPIPE's number))
(By contrast, Signal.trap("PIPE", "EXIT") would report exit code 0, on receiving the signal, which indicates success.)
Note that in a shell context the exit code is often not apparent in a command such as ruby example.rb | head, because the shell (by default) only reports the last command's exit code.
In bash, you can examine ${PIPESTATUS[@]} to see the exit codes of all commands in the pipeline.
Minimal example (run from bash):
ruby -e "Signal.trap('PIPE','SYSTEM_DEFAULT');(1..1e5).each do|i| puts i end" | head
The Ruby code tries to output 100,000 lines, but head only outputs the first 10 lines and then exits, which closes the read end of the pipe that connects the two commands.
The next time the Ruby code tries to write to the write end of that now broken pipe (after filling up the pipeline buffer), it triggers signal SIGPIPE, which terminates the Ruby process quietly, with exit code 141, which you can verify with echo ${PIPESTATUS[0]} afterwards.
By contrast, if you removed Signal.trap('PIPE','SYSTEM_DEFAULT'), i.e. with Ruby's default behavior, the command would break noisily (several lines of stderr output), and the exit code would be the nondescript 1.
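The same pipeline can be driven from Ruby with Open3.pipeline_r, which exposes each command's Process::Status. This is a sketch assuming a Unix-like system with head available. From Ruby you see the terminating signal itself rather than the shell's 128 + 13 encoding.

```ruby
require "open3"
require "rbconfig"

# Producer opts into the system default SIGPIPE handling, as described above.
producer = "Signal.trap('PIPE', 'SYSTEM_DEFAULT'); (1..100_000).each { |i| puts i }"

head_output = nil
statuses = nil
Open3.pipeline_r([RbConfig.ruby, "-e", producer], ["head", "-n", "3"]) do |last_stdout, wait_threads|
  head_output = last_stdout.read           # head prints three lines and exits
  statuses = wait_threads.map(&:value)     # one Process::Status per command
end
producer_status = statuses.first
# producer_status.signaled? and producer_status.termsig describe how the producer ended
```

A termsig equal to Signal.list["PIPE"] corresponds to the exit code 141 that bash reports.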
Custom handling of SIGPIPE:
The following builds on donovan.lampa's helpful answer and adds an improvement suggested by Kimmo Lehto, who points out that, depending on your script's purpose, receiving SIGPIPE shouldn't always terminate quietly, because it may indicate a legitimate error condition, notably in network code such as code for downloading a file from the internet.
He recommends the following idiom for that scenario:
begin
  # ... The code that could trigger SIGPIPE
rescue Errno::EPIPE
  # ... perform any cleanup, logging, ... here
  # Raise an exception - which translates into stderr output -
  # but only when outputting directly to a terminal.
  # That way, failure is quiet inside a pipeline, such as when
  # piping to standard utility `head`, where SIGPIPE is an expected
  # condition.
  raise if $stdout.tty?
  # If the stack trace that the `raise` call results in is too noisy
  # use something like the following instead, which outputs just the
  # error message itself to stderr:
  # $stderr.puts $! if $stdout.tty?
  # Or, even simpler:
  # warn $! if $stdout.tty?
  # Exit with the usual exit code that indicates termination by SIGPIPE
  exit 141
end
As a one-liner:
... rescue Errno::EPIPE raise if $stdout.tty?; exit 141
Note: Rescuing Errno::EPIPE works, because if the signal is ignored, the system call writing to the pipeline returns to the caller (instead of the caller process getting terminated), namely with standard error code EPIPE, which Ruby surfaces as exception Errno::EPIPE.
Although signal traps do work, as tokland said, they are defined application wide and can cause some unexpected behavior if you want to handle a broken pipe in some other way somewhere else in your app.
I'd suggest just using a standard rescue since the error still inherits from StandardError. More about this module of errors: http://ruby-doc.org/core-2.0.0/Errno.html
Example:
begin
  vari.print("x=" + my_val + "&y=1&z=Add+Num\r\n")
rescue Errno::EPIPE
  puts "Connection broke!"
end
Edit: It's important to note (as @mklement0 does in the comments) that if you were originally piping your output using puts to something expecting output on STDOUT, the final puts in the code above will raise another Errno::EPIPE exception. It's probably better practice to use STDERR.puts anyway.
begin
  vari.print("x=" + my_val + "&y=1&z=Add+Num\r\n")
rescue Errno::EPIPE
  STDERR.puts "Connection broke!"
end
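The rescue can be tried in isolation by writing to a pipe whose read end has already been closed. This sketch assumes Ruby's default SIGPIPE handling (no Signal.trap for PIPE installed), and the writer variable stands in for vari from the question.

```ruby
reader, writer = IO.pipe
reader.close                       # nobody is left to read: the pipe is now broken

error_seen = nil
begin
  writer.write("x=1&y=1&z=Add+Num\r\n")
  writer.flush
rescue Errno::EPIPE => e
  error_seen = e
  STDERR.puts "Connection broke! (#{e.message})"   # report on stderr, not stdout
ensure
  writer.close unless writer.closed?
end
```

The write fails with Errno::EPIPE, and the message goes to stderr so it cannot trigger a second broken-pipe error on stdout.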
@wallyk is right on the problem. One solution is to capture the signal with Signal.trap:
Signal.trap("PIPE", "EXIT")
If you are aware of some problem with this approach, please add a comment below.
