Socket.read() won't block in Ruby

I have the following code for a simple TCP server in Ruby:
    # server.rb
    require 'socket'

    class Server
      def initialize(port)
        @port = port
      end

      def run
        Socket.tcp_server_loop(@port) do |connection|
          Thread.new do
            loop do
              puts "IO: #{IO.select([connection]).inspect} - data: #{connection.read}"
            end
          end
        end
      end
    end

    server = Server.new(16451)
    server.run
As well as this trivial TCP client code:
    # client.rb
    require 'socket'

    client = TCPSocket.new('localhost', 16451)
    client.write('stuff')
It is my understanding that connection.read in server.rb should block if no data is present on the socket. However, when I run this on my macbook (OS X 10.12.5), it keeps spitting out the following output:
    IO: [[#<Socket:fd 12>], [], []] - data: stuff
    IO: [[#<Socket:fd 12>], [], []] - data:
    IO: [[#<Socket:fd 12>], [], []] - data:
    IO: [[#<Socket:fd 12>], [], []] - data:
    IO: [[#<Socket:fd 12>], [], []] - data:
    IO: [[#<Socket:fd 12>], [], []] - data:
    IO: [[#<Socket:fd 12>], [], []] - data:
    IO: [[#<Socket:fd 12>], [], []] - data:
    ...
It seems that IO.select thinks there is data available to read on the socket, while no such data has been sent.
How can I achieve a blocking read when working with sockets in Ruby? Am I overlooking something?
Matt's answer pointed me in the right direction. For future readers, here's my new code.
    # server.rb
    require 'socket'

    class Server
      BYTESIZE_OF_PACKED_INTEGER = [1].pack('i').bytesize

      def initialize(port)
        @port = port
      end

      def run
        Socket.tcp_server_loop(@port) do |connection|
          Thread.new do
            while (packed_msg_bytesize = connection.read(BYTESIZE_OF_PACKED_INTEGER))
              msg_bytesize = packed_msg_bytesize.unpack('i').first
              msg = connection.read(msg_bytesize)
              puts msg
            end
          end
        end
      end
    end

    server = Server.new(16451)
    server.run
And the client code.
    # client.rb
    require 'socket'

    msg = 'stuff'
    msg_bytesize = msg.bytesize
    packed_msg_bytesize = [msg_bytesize].pack('i')
    client = TCPSocket.new('localhost', 16451)
    client.write(packed_msg_bytesize)
    client.write(msg)
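
A hedged variation on the length-prefixed framing used above. pack('i') uses the machine's native int size and byte order, so this sketch assumes a fixed 4-byte big-endian header ('N') instead, which both ends can agree on regardless of platform. The names write_frame and read_frame are illustrative, and UNIXSocket.pair merely stands in for the real TCP connection.

```ruby
require 'socket'

# Write a 4-byte big-endian length header followed by the message bytes.
def write_frame(io, msg)
  io.write([msg.bytesize].pack('N'))
  io.write(msg)
end

# Read one frame; returns nil once the peer has closed the connection.
def read_frame(io)
  header = io.read(4)
  return nil if header.nil? || header.bytesize < 4 # EOF (or truncated header)
  io.read(header.unpack1('N'))
end

sender, receiver = UNIXSocket.pair
write_frame(sender, 'stuff')
write_frame(sender, 'more stuff')
sender.close

frames = []
while (msg = read_frame(receiver))
  frames << msg
end
p frames # => ["stuff", "more stuff"]
```

Because every read asks for an exact number of bytes, the loop blocks while the connection is open and idle, and ends cleanly when read returns nil at EOF.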

read will block if there is no data, but not at EOF. The IO#read docs say:
When this method is called at end of file, it returns nil or "", depending on length: read, read(nil), and read(0) return "", read(positive_integer) returns nil.
Since calling read on EOF doesn’t block, select will return the IO as readable straight away.
In your code the first call to read will block until all the data is read from the connection (i.e. the other end has closed it). From then it will be at EOF, so select will return it as ready, and read will return an empty string immediately.
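
The behaviour described above can be reproduced without a network. This is a minimal sketch, assuming a platform where UNIXSocket.pair is available (Linux, macOS); the pair stands in for the client and server ends of the TCP connection.

```ruby
require 'socket'

reader, writer = UNIXSocket.pair

writer.write('stuff')
writer.close # the peer closes, so the reader will reach EOF

first  = reader.read    # blocks until EOF, returns everything that was sent
second = reader.read    # already at EOF: returns "" immediately
third  = reader.read(1) # positive length at EOF: returns nil
ready  = IO.select([reader], nil, nil, 0) # EOF still counts as "readable"

p first  # => "stuff"
p second # => ""
p third  # => nil
p ready.first == [reader] # => true
```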

Ruby HTTP2 GET request

I'm trying to use the Ruby gem http-2 to send a GET request to Google.
I've lifted the code directly from the example and simplified it slightly:
    require 'http/2'
    require 'socket'
    require 'openssl'
    require 'uri'

    uri = URI.parse('http://www.google.com/')
    tcp = TCPSocket.new(uri.host, uri.port)
    sock = tcp

    conn = HTTP2::Client.new
    conn.on(:frame) do |bytes|
      # puts "Sending bytes: #{bytes.unpack("H*").first}"
      sock.print bytes
      sock.flush
    end
    conn.on(:frame_sent) do |frame|
      puts "Sent frame: #{frame.inspect}"
    end
    conn.on(:frame_received) do |frame|
      puts "Received frame: #{frame.inspect}"
    end

    stream = conn.new_stream
    stream.on(:close) do
      puts 'stream closed'
      sock.close
    end
    stream.on(:half_close) do
      puts 'closing client-end of the stream'
    end
    stream.on(:headers) do |h|
      puts "response headers: #{h}"
    end
    stream.on(:data) do |d|
      puts "response data chunk: <<#{d}>>"
    end

    head = {
      ':scheme' => uri.scheme,
      ':method' => 'GET',
      ':path' => uri.path
    }

    puts 'Sending HTTP 2.0 request'
    stream.headers(head, end_stream: true)

    while !sock.closed? && !sock.eof?
      data = sock.read_nonblock(1024)
      # puts "Received bytes: #{data.unpack("H*").first}"
      begin
        conn << data
      rescue => e
        puts "#{e.class} exception: #{e.message} - closing socket."
        e.backtrace.each { |l| puts "\t" + l }
        sock.close
      end
    end
The output is:
    Sending HTTP 2.0 request
    Sent frame: {:type=>:settings, :stream=>0, :payload=>[[:settings_max_concurrent_streams, 100]]}
    Sent frame: {:type=>:headers, :flags=>[:end_headers, :end_stream], :payload=>[[":scheme", "http"], [":method", "GET"], [":path", "/"]], :stream=>1}
    closing client-end of the stream
(Note: you get pretty much the same output as above by running the actual example file, i.e., ruby client.rb http://www.google.com/)
Why is no response data being displayed?
Public servers like google.com do not support HTTP/2 in clear text.
You are trying to connect to http://google.com, while you should really connect to https://google.com (note the https scheme).
In order to do that, you may need to wrap the TCP socket using TLS (see for example here), if http-2 does not do it for you.
Note also that HTTP/2 requires strong TLS ciphers and ALPN, so make sure that you have an updated version of OpenSSL (at least 1.0.2).
Given that the author of http-2 is a strong HTTP/2 supporter, I am guessing that your only problem is the fact that you tried clear-text http rather than https, and I expect that TLS cipher strength and ALPN are taken care of by the http-2 library.
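
As a rough sketch of the TLS wrapping mentioned above, using only Ruby's bundled openssl library: the helper names h2_tls_context and open_h2_socket are made up for this illustration, and whether the http-2 gem needs anything beyond this is an assumption that should be checked against its own example client.

```ruby
require 'socket'
require 'openssl'

# Build a TLS context that offers HTTP/2 ("h2") through ALPN.
def h2_tls_context
  ctx = OpenSSL::SSL::SSLContext.new
  ctx.set_params # library defaults, including peer verification
  ctx.alpn_protocols = ['h2']
  ctx
end

# Open a TCP connection and wrap it in TLS. Needs network access when called.
def open_h2_socket(host, port = 443)
  tcp = TCPSocket.new(host, port)
  ssl = OpenSSL::SSL::SSLSocket.new(tcp, h2_tls_context)
  ssl.sync_close = true # closing the TLS socket also closes the TCP socket
  ssl.hostname = host   # SNI, so the server can pick the right certificate
  ssl.connect
  unless ssl.alpn_protocol == 'h2'
    raise "server did not select h2 (got #{ssl.alpn_protocol.inspect})"
  end
  ssl
end

# Usage (not executed here, since it requires a network connection):
#   sock = open_h2_socket('www.google.com')
#   ...then use sock wherever the plain TCPSocket was used in the question.
```

Checking alpn_protocol after the handshake makes the failure explicit when the server falls back to HTTP/1.1 instead of silently sending HTTP/2 frames to a peer that will not understand them.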

Workaround for Timeouts with Http.rb and Celluloid?

I know that timeouts are currently not supported with Http.rb and Celluloid[1], but is there an interim workaround?
Here's the code I'd like to run:
    def fetch(url, options = {})
      puts "Request -> #{url}"
      begin
        options = options.merge({ socket_class: Celluloid::IO::TCPSocket,
                                  timeout_class: HTTP::Timeout::Global,
                                  timeout_options: {
                                    connect_timeout: 1,
                                    read_timeout: 1,
                                    write_timeout: 1
                                  } })
        HTTP.get(url, options)
      rescue HTTP::TimeoutError => e
        [do more stuff]
      end
    end
Its goal is to test whether a server is live and healthy. I'd be open to alternatives (e.g. %x(ping <server>)), but these seem less efficient and less able to get at what I'm looking for.
[1] https://github.com/httprb/http.rb#celluloidio-support
You can set a timeout on the future when you fetch the result of the request.
Here is how to use a timeout with Http.rb and Celluloid::IO:
    require 'celluloid/io'
    require 'http'

    TIMEOUT = 10 # in sec

    class HttpFetcher
      include Celluloid::IO

      def fetch(url)
        HTTP.get(url, socket_class: Celluloid::IO::TCPSocket)
      rescue Exception => e
        # error
      end
    end

    fetcher = HttpFetcher.new
    urls = %w(http://www.ruby-lang.org/ http://www.rubygems.org/ http://celluloid.io/)

    # Kick off a bunch of future calls to HttpFetcher to grab the URLs in parallel
    futures = urls.map { |u| [u, fetcher.future.fetch(u)] }

    # Consume the results as they come in
    futures.each do |url, future|
      # Wait for HttpFetcher#fetch to complete for this request
      response = future.value(TIMEOUT)
      puts "*** Got #{url}: #{response.inspect}\n\n"
    end
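
The same "wait for the result, but only up to a deadline" idea can be sketched with plain Ruby threads. This is not Celluloid's implementation and does not use http.rb; value_within is a hypothetical helper, and the sleeping block only simulates an unresponsive server.

```ruby
# Hypothetical helper (not part of Celluloid or http.rb): run the block in a
# plain thread and wait at most `seconds` for its result.
def value_within(seconds)
  worker = Thread.new { yield }
  if worker.join(seconds) # join returns nil when the deadline passes first
    worker.value
  else
    worker.kill           # give up on the slow call
    :timed_out
  end
end

fast = value_within(1) { :alive }
slow = value_within(0.1) { sleep 2; :alive }

p fast # => :alive
p slow # => :timed_out
```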

In-memory stream for Ruby

When working with Protocol Buffers, the real message size becomes known only once the whole object has been written to an IO. So I use the following approach: write the object to an intermediate stream, get its size, and then write the whole data, preceded by a header containing the size as an int, to the TCP socket.
What I do not like in the following code is the message_size function, which uses a real disk file instead of a memory stream.
    require 'protocol_buffers'

    module MyServer
      class AuthRequest < ProtocolBuffers::Message
        required :int32, :vers, 1
        required :int32, :orgID, 2
        required :string, :password, 3
      end

      class MyServer
        def self.authenticate(socket, params)
          auth = AuthRequest.new(:vers => params[:vers], :orgID => params[:orgID], :password => params[:password])
          size = message_size(auth)
          if size.present?
            socket.write([size, 0].pack 'NN')
            auth.serialize(socket)
            socket.flush
          end
        end

        def self.message_size(obj)
          size = nil
          io = File.new('tempfile', 'w')
          begin
            obj.serialize(io)
            io.flush
            size = io.stat.size + 4
          ensure
            io.close
          end
          size
        end
      end
    end
Controller:
    require 'my_server'
    require 'socket'

    class MyServerTestController < ActionController::Base
      def test
        socket = TCPSocket.new('192.168.1.15', '12345')
        begin
          MyServer::MyServer.authenticate(socket, {vers: 1, orgID: 100, password: 'hello'})
        ensure
          socket.close
        end
      end
    end
You can easily use StringIO as your memory stream. Mind you, it is called StringIO because it is implemented on top of a string, but it is by no means limited to string data; it works just as easily with binary data:
    require 'stringio' # already loaded in Rails, needed in plain Ruby

    def self.message_size_mem(obj)
      size = nil
      io = StringIO.new
      begin
        obj.serialize(io)
        io.flush
        size = io.size + 4
      ensure
        io.close
      end
      size
    end

    auth = AuthRequest.new(:vers => 122324, orgID: 9900235, password: 'this is a test for serialization')
    MyServer.message_size(auth)
    # => 47
    MyServer.message_size_mem(auth)
    # => 47

    io = StringIO.new
    auth.serialize(io)
    io.flush
    io.string
    # => "\bԻ\a\u0010ˡ\xDC\u0004\u001A this is a test for serialization"
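
Once the message is in memory, it may be simpler to serialize it only once and send the buffered bytes themselves. The sketch below assumes only that the message responds to serialize(io), as in the question; fake_message is a hypothetical stand-in for a real Protocol Buffers object, and framed is an illustrative name.

```ruby
require 'stringio'

# Serialize once into memory, then build header + body from the same bytes.
# The header mirrors the question: [size + 4, 0] packed as two 32-bit
# big-endian integers.
def framed(obj)
  buffer = StringIO.new(String.new) # String.new gives an empty binary string
  obj.serialize(buffer)
  body = buffer.string
  [body.bytesize + 4, 0].pack('NN') + body
end

# Hypothetical stand-in for a message object: anything with serialize(io).
fake_message = Object.new
def fake_message.serialize(io)
  io.write('hello')
end

frame = framed(fake_message)
p frame.bytesize # => 13 (8 header bytes + 5 body bytes)
```

The resulting string can then be handed to socket.write in one call, so the size in the header and the bytes on the wire always come from the same serialization pass.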

streaming html from webrick?

Has anyone tried streaming html/text/content from webrick? I've tried assigning an IO to the response body, but webrick is waiting for the stream to be closed first.
Found this link by accident (http://redmine.ruby-lang.org/attachments/download/161), which contains a WEBrick patch:
    # Copyright (C) 2008 Brian Candler, released under Ruby Licence.
    #
    # A collection of small monkey-patches to webrick.
    require 'webrick'

    module WEBrick
      class HTTPRequest
        # Generate HTTP/1.1 100 continue response. See
        # http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/18459
        def continue
          if self['expect'] == '100-continue' && @config[:HTTPVersion] >= "1.1"
            @socket.write "HTTP/#{@config[:HTTPVersion]} 100 continue\r\n\r\n"
            @header.delete('expect')
          end
        end
      end

      class HTTPResponse
        alias :orig_setup_header :setup_header

        # Correct termination of streamed HTTP/1.1 responses. See
        # http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/18454 and
        # http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/18565
        def setup_header
          orig_setup_header
          unless chunked? || @header['content-length']
            @header['connection'] = "close"
            @keep_alive = false
          end
        end

        # Allow streaming of zipfile entry. See
        # http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/18460
        def send_body(socket)
          if @body.respond_to?(:read) then send_body_io(socket)
          elsif @body.respond_to?(:call) then send_body_proc(socket)
          else send_body_string(socket)
          end
        end

        # If the response body is a proc, then we invoke it and pass in
        # an object which supports "write" and "<<" methods. This allows
        # arbitrary output streaming.
        def send_body_proc(socket)
          if @request_method == "HEAD"
            # do nothing
          elsif chunked?
            @body.call(ChunkedWrapper.new(socket, self))
            _write_data(socket, "0#{CRLF}#{CRLF}")
          else
            size = @header['content-length'].to_i
            @body.call(socket) # TODO: StreamWrapper which supports offset, size
            @sent_size = size
          end
        end

        class ChunkedWrapper
          def initialize(socket, resp)
            @socket = socket
            @resp = resp
          end

          def write(buf)
            return if buf.empty?
            data = ""
            data << format("%x", buf.size) << CRLF
            data << buf << CRLF
            socket = @socket
            @resp.instance_eval {
              _write_data(socket, data)
              @sent_size += buf.size
            }
          end
          alias :<< :write
        end
      end
    end

    if RUBY_VERSION < "1.9"
      old_verbose, $VERBOSE = $VERBOSE, nil
      # Increase from default of 4K for efficiency, similar to
      # http://svn.ruby-lang.org/cgi-bin/viewvc.cgi/branches/ruby_1_8/lib/net/protocol.rb?r1=11708&r2=12092
      # In trunk the default is 64K and can be adjusted using :InputBufferSize,
      # :OutputBufferSize
      WEBrick::HTTPRequest::BUFSIZE = 16384
      WEBrick::HTTPResponse::BUFSIZE = 16384
      $VERBOSE = old_verbose
    end
To use it, simply pass a proc as the response body, like so:
    res.body = proc { |w|
      10.times do
        w << Time.now.to_s
        sleep(1)
      end
    }
woot!
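
For readers wondering what the ChunkedWrapper in the patch actually puts on the wire, here is a small standalone sketch of HTTP/1.1 chunked framing. It does not depend on WEBrick; encode_chunk and last_chunk are illustrative names, and the sketch uses bytesize because the chunk size is a byte count, which differs from the character count for multibyte strings on Ruby 1.9 and later.

```ruby
# One chunk: size in hexadecimal, CRLF, the data, CRLF.
def encode_chunk(buf)
  return '' if buf.empty? # an empty chunk would signal the end of the body
  format('%x', buf.bytesize) + "\r\n" + buf + "\r\n"
end

# The zero-length chunk that terminates a chunked body.
def last_chunk
  "0\r\n\r\n"
end

body = encode_chunk('hello') + encode_chunk('a' * 26) + last_chunk
p encode_chunk('hello') # => "5\r\nhello\r\n"
```

Each call to w << in the proc above corresponds to one such chunk, which is why the client can display data before the response is complete.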
I would suggest against using WEBrick for anything really, it's junk. I would say try Mongrel.
I know that wasn't your question, it's just some friendly advice.

Recovering from a broken TCP socket in Ruby when in gets()

I'm reading lines of input on a TCP socket, similar to this:
    class Bla
      def getcmd
        @sock.gets unless @sock.closed?
      end

      def start
        srv = TCPServer.new(5000)
        @sock = srv.accept
        while ! @sock.closed?
          ans = getcmd
        end
      end
    end
If the endpoint terminates the connection while getline() is running then gets() hangs.
How can I work around this? Is it necessary to do non-blocking or timed I/O?
You can use select to check whether it is safe to call gets on the socket; see the following TCPServer implementation, which uses this technique.
    require 'socket'

    host, port = 'localhost', 7000
    TCPServer.open(host, port) do |server|
      while client = server.accept
        readfds = true
        got = nil
        begin
          readfds, writefds, exceptfds = select([client], nil, nil, 0.1)
          p :r => readfds, :w => writefds, :e => exceptfds
          if readfds
            got = client.gets
            p got
          end
        end while got
      end
    end
And here is a client that tries to break the server:
    require 'socket'

    host, port = 'localhost', 7000
    TCPSocket.open(host, port) do |socket|
      socket.puts "Hey there"
      socket.write 'he'
      socket.flush
      socket.close
    end
IO#closed? returns true only when both the reader and the writer are closed.
In your case, @sock.gets returns nil, and then you call getcmd again, so this runs in a never-ending loop. You can either use select, or close the socket when gets returns nil.
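
A minimal sketch of the second option, closing the socket once gets returns nil. UNIXSocket.pair stands in for the accepted TCP connection here (an assumption made so the example runs without a network), and the client side sends the same data as the "breaking" client above.

```ruby
require 'socket'

server_side, client_side = UNIXSocket.pair

client_side.puts 'Hey there'
client_side.write 'he' # an unterminated final line
client_side.close

lines = []
while (line = server_side.gets) # gets returns nil once the peer has closed
  lines << line
end
server_side.close # now closed? is true and the outer loop can stop

p lines # => ["Hey there\n", "he"]
```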
I recommend using readpartial to read from your socket and also catching peer resets:
    while true
      sockets_ready = select(@sockets, nil, nil, nil)
      if sockets_ready != nil
        sockets_ready[0].each do |socket|
          begin
            if (socket == @server_socket)
              # puts "Connection accepted!"
              @sockets << @server_socket.accept
            else
              # Received something on a client socket
              if socket.eof?
                # puts "Disconnect!"
                socket.close
                @sockets.delete(socket)
              else
                data = ""
                recv_length = 256
                while (tmp = socket.readpartial(recv_length))
                  data += tmp
                  break if (!socket.ready?)
                end
                listen socket, data
              end
            end
          rescue Exception => exception
            case exception
            when Errno::ECONNRESET,Errno::ECONNABORTED,Errno::ETIMEDOUT
              # puts "Socket: #{exception.class}"
              @sockets.delete(socket)
            else
              raise exception
            end
          end
        end
      end
    end
This code borrows heavily from some nice IBM code by M. Tim Jones. Note that #server_socket is initialized by:
    @server_socket = TCPServer.open(port)
@sockets is just an array of sockets.
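
Unlike gets, readpartial signals the end of the stream by raising EOFError rather than returning nil. The sketch below isolates that behaviour; read_until_closed is a hypothetical helper, and UNIXSocket.pair again stands in for a real client connection.

```ruby
require 'socket'

# Read until the peer closes; readpartial raises EOFError at end of stream.
def read_until_closed(socket, chunk_size = 256)
  data = String.new
  loop { data << socket.readpartial(chunk_size) }
rescue EOFError, Errno::ECONNRESET
  data # return whatever arrived before the connection ended
end

server_side, client_side = UNIXSocket.pair
client_side.write('partial line without newline')
client_side.close

received = read_until_closed(server_side)
server_side.close

p received # => "partial line without newline"
```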
I simply pgrep "ruby" to find the pid, and kill -9 the pid and restart.
If you believe the rdoc for ruby sockets, they don't implement gets. This leads me to believe gets is being provided by a higher level of abstraction (maybe the IO libraries?) and probably isn't aware of socket-specific things like 'connection closed.'
Try using recvfrom instead of gets
