reconnect tcpsocket (or how to detect closed socket) - ruby

I have a ruby tcpsocket client that is connected to a server.
How can I check to see if the socket is connected before I send the data ?
Do I try to "rescue" a disconnected tcpsocket, reconnect and then resend ? if so, does anyone have a simple code sample as I don't know where to begin :(
I was quite proud that I managed to get a persistent connected client tcpsocket in rails. Then the server decided to kill the client and it all fell apart ;)
edit
I've used this code to get round some of the problems - it will try to reconnect if not connected, but won't handle the case if the server is down (it will keep retrying). Is this the start of the right approach ? Thanks
def self.write(data)
begin
##my_connection.write(data)
rescue Exception => e
##my_connection = TCPSocket.new 'localhost', 8192
retry
end
end

What I usually do in these types of scenarios is keep track of consecutive retries in a variable and have some other variable that sets the retry roof. Once we hit the roof, throw some type of exception that indicates there is a network or server problem. You'll want to reset the retry count variable on success of course.

Related

SSL_write: bad write retry. Exception. Readin e-mails with IMAP IDLE. In Ruby

I woul like to get unseen mails "as soon as possible", using a Ruby (2.1) script to implement IMAP IDLE ("push notify") feature.
With the help of some guys (see also: Support for IMAP IDLE in ruby), I wrote the script here:
https://gist.github.com/solyaris/b993283667f15effa579
def idle_loop(imap, search_condition, folder)
# https://stackoverflow.com/questions/4611716/how-imap-idle-works
loop do
begin
imap.select folder
imap.idle do |resp|
#trap_shutdown
# You'll get all the things from the server.
#For new emails you're only interested in EXISTS ones
if resp.kind_of?(Net::IMAP::UntaggedResponse) and resp.name == "EXISTS"
# Got something. Send DONE. This breaks you out of the blocking call
imap.idle_done
end
end
# We're out, which means there are some emails ready for us.
# Go do a search for UNSEEN and fetch them.
retrieve_emails(imap, search_condition, folder) { |mail| process_email mail}
#rescue Net::IMAP::Error => imap_err
# Socket probably timed out
# puts "IMAP IDLE socket probably timed out.".red
rescue SignalException => e
# https://stackoverflow.com/questions/2089421/capturing-ctrl-c-in-ruby
puts "Signal received at #{time_now}: #{e.class} #{e.message}".red
shutdown imap
rescue Exception => e
puts "Something went wrong at #{time_now}: #{e.class} #{e.message}".red
imap.noop
end
end
end
Now, all run smootly at first glance, BUT I have the exception
Something went wrong: SSL_write: bad write retry
at this line in code:
https://gist.github.com/solyaris/b993283667f15effa579#file-idle-rb-L189
The error happen when I leave the script running for more than... say more than 30 minutes.
BTW, the server is imap.gmail.com (arghh...), and I presume is something related to IMAP IDLE reconnection socket (I din't read yet the ruby UMAP library code) but I do not understand the reason of the exception;
Any idea for the reason if the exception ? Just trap the exception to fix the issue ?
thanks
giorgio
UPDATE
I modified a bit the exception handling (see gist code: https://gist.github.com/solyaris/b993283667f15effa579)
Now I got a Net::IMAP::Error connection closed I just restart the IMAP connection and it seems working...
Sorry for confusing, anyway in general any comments on code I wrote, IDLE protocol correct management, are welcome.
The IMAP IDLE RFC says to stop IDLE after at most 29 minutes and reissue a new IDLE command. IMAP servers are permitted to assume that the client is dead and has gone away after 31 minutes of inactivity.
You may also find that some NAT middleboxes silently sabotage your connection long before the half-hour is up, I've seen timeouts as short as about two minutes. (Every time I see something like that I scream "vivat ipv6!") I don't think there's any good solution for those middleboxes, except maybe to infect them with a vile trojan, but the bad solutions include adjusting your idle timeout if you get the SSL exception before a half-hour is up.

How can I properly handle persistent TCP socket connections (to simulate an HTTP server)?

So, I'm trying to simulate some basic HTTP persistent connections using sockets and Ruby - for a college class.
The point is to build a server - able to handle multiple clients - that receives a file path and gives back the file content - just like an HTTP GET.
The current server implementation loops listening for clients, fires a new thread when there's an incoming connection and reads the file paths from this socket. It's very dumb, but it works fine when working with non-presistent connections - one request per connection.
But they should be persistent.
Which means the client shouldn't worry about closing the connection. In the non-persistent version the servers echoes the response and close the connection - goodbye client, farewell.
But being persistent means the server thread should loop and wait for more incoming requests until... well until there's no more requests. How does the server knows that? It doesn't! Some sort of timeout is needed. I tried to do that with Ruby's Timeout, but it didn't work.
Googling for some solutions - besides being thoroughly advised to avoid using Timeout module - I've seen a lot of posts about the IO.select method, that should handle this socket waiting issue way better than using threads and stuff (which really sounds cool, considering how Ruby threads (don't) work). I'm trying to understand here how IO.select works, but still wasn't able to make it work in the current scenario.
So I aske basically two things:
how can I efficiently work this timeout issue on the server-side, either using some thread based solution, low-level socket options or some IO.select magic?
how can the client side know that the server has closed its side of the connection?
Here's the current code for the server:
require 'date'
module Sockettp
class Server
def initialize(dir, port = Sockettp::DEFAULT_PORT)
#dir = dir
#port = port
end
def start
puts "Starting Sockettp server..."
puts "Serving #{#dir.yellow} on port #{#port.to_s.green}"
Socket.tcp_server_loop(#port) do |socket, client_addrinfo|
handle socket, client_addrinfo
end
end
private
def handle(socket, addrinfo)
Thread.new(socket) do |client|
log "New client connected"
begin
loop do
if client.eof?
puts "#{'-' * 100} end connection"
break
end
input = client.gets.chomp
body = content_for(input)
response = {}
if body
response.merge!({
status: 200,
body: body
})
else
response.merge!({
status: 404,
body: Sockettp::STATUSES[404]
})
end
log "#{addrinfo.ip_address} #{input} -- #{response[:status]} #{Sockettp::STATUSES[response[:status]]}".send(response[:status] == 200 ? :green : :red)
client.puts(response.to_json)
end
ensure
socket.close
end
end
end
def content_for(path)
path = File.join(#dir, path)
return File.read(path) if File.file?(path)
return Dir["#{path}/*"] if File.directory?(path)
end
def log(msg)
puts "#{Thread.current} -- #{DateTime.now.to_s} -- #{msg}"
end
end
end
Update
I was able to simulate the timeout behaviour using the IO.select method, but the implementation doesn't feel good when combining with a couple of threads for accepting new connections and another couple for handling requests. The concurrency makes the situation mad and unstable, and I'm probably not sticking with it unless I can figure out a better way of using this solution.
Update 2
Seems like Timeout is still the best way to handle this. I'm sticking with it till find a better option.
I still don't know how to deal with zombie client connections.
Solution
I endend up using IO.select (got inspired when looking at the webrick code). You cha check the final version here (lib/http/server/client_handler.rb)
You should implement something like heartbeat packets.Client side should send special packets to after few secs/mins to ensure that server doesn't time out the connection on the client end.You just avoid doing anything in this call.

Elegant way to stop socket read operation from outside

I implemented a small client server application in Ruby and I have the following problem: The server starts a new client session in a new thread for each connecting client, but it should be possible to shutdown the server and stop all the client sessions in a 'polite' way from outside without just killing the thread while I don't know which state it is in.
So I decided that the client session object gets a `stop' flag which can be set from outside and is checked before each action. The problem is that it should not wait for the client, if it is just waiting for a request. I have the following temporary solution:
def read_client
loop do
begin
timeout(1) { return #client.gets }
rescue Timeout::Error
if #stop
stop # Notifies the client and closes the connection
return nil
end
end
end
end
But that sucks, looks terrible and intuitively, this should be such a normal thing that there has to be a `normal' solution to it. I don't even know if it is safe or if it could happen that the gets operation reads part of the client request, but not all of it.
Another side question is, if setting/getting a boolean flag is an atomic operation in Ruby (or if I need an additional Mutex for the flag).
Thread-per-client approach is usually a disaster for server design. Also blocking I/O is difficult to interrupt without OS-specific tricks. Check out non-blocking sockets, see for example, answers to this question.

How can I tell if my Ruby server script is being overloaded?

I have a daemonized ruby script running on my server that looks like this:
#server = TCPServer.open(61101)
loop do
#thr = Thread.new(#server.accept) do |sock|
Thread.current[:myArrayOfHashes] = [] # hashes containing attributes of myObject
SystemTimer.timeout_after(5) do
Thread.current[:string] = sock.gets
sock.close
# parse the string and load the data into myArrayOfHashes
Myobject.transaction do # Update the myObjects Table
Thread.current[:myArrayOfHashes].each do |h|
Thread.current[:newMyObject] = Myobject.new
# load up the new object with data
Thread.current[:newMyObject].save
end
end
end
end
#thr.join
end
This server receives and manages data for my rails application which is all running on Mac OS 10.6. The clients call the server every 15 minutes on the 15 and while I currently only have 16 or so clients calling every 15 min on the 15, I'm wondering about the following:
If two clients call at close enough to the same time, will one client's connection attempt fail?
How I can figure out how many client connections my server can accommodate at the same time?
How can I monitor how much memory my server is using?
Also, is there an article you can point me toward that discusses the best way to implement this kind of a server? I mean can I have multiple instances of the server listening on the same port? Would that even help?
I am using Bluepill to monitor my server daemons.
1 and 2
The answer is no, two clients connecting close to each other will not make the connection fail (however multiple clients connecting may fail, see below).
The reason is the operating system has a default so called listening queue built into all server sockets. So even if you are not calling accept fast enough in your program, the OS will still keep buffering incoming connections for you. It will buffer these connections for as long as the listening queue does not get filled.
Now what is the size of this queue then?
In most cases the default size typically used is 5. The size is set after you create the socket and you call listen on this socket (see man page for listen here).
For Ruby TCPSocket automatically calls listen for you, and if you look at the C-source code for TCPSocket you will find that it indeed sets the size to 5:
https://github.com/ruby/ruby/blob/trunk/ext/socket/ipsocket.c#L108
SOMAXCONN is defined as 5 here:
https://github.com/ruby/ruby/blob/trunk/ext/socket/mkconstants.rb#L693
Now what happens if you don't call accept fast enough and the queue gets filled?
The answer is found in the man page of listen:
The backlog argument defines the maximum length to which the queue of pending connections for sockfd may grow. If a connection request arrives when the queue is full, the client may receive an error with an indication of ECONNREFUSED or, if the underlying protocol supports retransmission, the request may be ignored so that a later reattempt at connection succeeds.
In your code however there is one problem which can make the queue fill up if more than 5 clients try to connect at the same time: you're calling #thr.join at the end of the loop.
What effectively happens when you do this is that your server will not accept any new incoming connections until all your stuff inside your accept-thread has finished executing.
So if the database stuff and the other things you are doing inside the accept-thread takes a long time, the listening queue may fill up in the meantime. It depends on how long your processing takes, and how many clients could potentially be connecting at the exact same time.
3
You didn't say which platform you are running on, but on linux/osx the easiest way is to just run top in your console. For more advanced memory monitoring options you might want to check these out:
ruby/ruby on rails memory leak detection
track application memory usage on heroku

Ruby TCPSocket doesn't notice it when server is killed

I've this ruby code that connects to a TCP server (namely, netcat). It loops 20 times, and sends "ABCD ". If I kill netcat, it takes TWO iterations of the loop for an exception to be triggered. On the first loop after netcat is killed, no exception is triggered, and "send" reports that 5 bytes have been correctly written... Which in the end is not true, since of course the server never received them.
Is there a way to work around this issue ? Right now I'm losing data : since I think it's been correctly transfered, I'm not replaying it.
#!/usr/bin/env ruby
require 'rubygems'
require 'socket'
sock = TCPSocket.new('192.168.0.10', 5443)
sock.sync = true
20.times do
sleep 2
begin
count = sock.write("ABCD ")
puts "Wrote #{count} bytes"
rescue Exception => myException
puts "Exception rescued : #{myException}"
end
end
When you're sending data your blocking call will return when the data is written to the TCP output buffer. It would only block if the buffer was full, waiting for the server to acknowledge receipt of previous data that was sent.
Once this data is in the buffer, the network drivers try to send the data. If the connection is lost, on the second attempt to write, your application discovers the broken state of the connection.
Also, how does the connection close? Is the server actively closing the connection? In which case client socket would be notified at its next socket call. Or has it crashed? Or perhaps there's a network fault which means you can no longer communicate.
Discovering a broken connection only occurs when you try to send or receive data over the socket. This is different from having the connection actively closed. You simply can't determine if the connection is still alive without doing something with it.
So try doing sock.recv(0) after the write - if the socket has failed this would raise "Errno::ECONNRESET: Connection reset by peer - recvfrom(2)". You could also try sock.sendmsg "", 0 (not sock.write, or sock.send), and this would report a "Errno::EPIPE: Broken pipe - sendmsg(2)".
Even if you got your hands on the TCP packets and get acknowledgement that the data had been received at the other end, there's still no guarantee that the server will have processed this data - it might in its input buffer but not yet processed.
All of this might help identify a broken connection earlier, but it still won't guarantee that the data was received and processed by the server. The only sure way to know that the application has processed your message is with an application level response.
I tried without the sleep function (just to make sure it wasn't putting on hold anything) and still no luck:
#!/usr/bin/env ruby
require 'rubygems'
require 'socket'
require 'activesupport' # Fixnum.seconds
sock = TCPSocket.new('127.0.0.1', 5443)
sock.sync = true
will_restart_at = Time.now + 2.seconds
should_continue = true
while should_continue
if will_restart_at <= Time.now
will_restart_at = Time.now + 2.seconds
begin
count = sock.write("ABCD ")
puts "Wrote #{count} bytes"
rescue Exception => myException
puts "Exception rescued : #{myException}"
should_continue = false
end
end
end
I analyzed with Wireshark and the two solutions are exactly behaving identically.
I think (and can't be sure) that until you actually call your_socket.write (which will not fail as the socket is still opened because you weren't probing for its possible destruction), the socket won't raise any error.
I tried to simulate this with nginx and manual TCP sockets. And look at that:
irb> sock = TCPSocket.new('127.0.0.1', 80)
=> #<TCPSocket:0xb743b824>
irb> sock.write("salut")
=> 5
irb> sock.read
=> "<html>\r\n<head><title>400 Bad Request</title></head>\r\n<body>\r\n</body>\r\n</html>\r\n"
# Here, I kill nginx
irb> sock.write("salut")
=> 5
irb> sock.read
=> ""
irb> sock.write("salut")
Errno::EPIPE: Broken pipe
So what's the conclusion from here? Unless you're actually expecting some data from the server, you're screwed to detect that you've lost the connection :)
To detect a gracefully close, you'll have to read from the socket - read returning 0 indicates the socket has closed.
If you do need know if data got sent successfully though, there's no way other than implementing ACKs of the data at the application level.

Resources