Catching RabbitMQ connection loss mid-request on Passenger - Ruby

I am using the bunny gem to publish messages to a RabbitMQ server.
Following the recommendation in the official documentation for use with Passenger in Rack apps, I set up connection creation to run after a new worker process is started:
if defined?(PhusionPassenger)
  PhusionPassenger.on_event(:starting_worker_process) do |forked|
    if forked
      # We're in a smart spawning mode
      # Now is a good time to connect to RabbitMQ
      #
      # Every process will get its own connection!
      $rabbitmq_connection = Bunny.new rabbit[settings.environment.to_s]
      $rabbitmq_connection.start
    end
  end
else # For non-Passenger environments - e.g. specs or rackup
  $rabbitmq_connection = Bunny.new rabbit[settings.environment.to_s]
  $rabbitmq_connection.start
end
This works pretty well. However, when the connection is lost mid-request (before the message can be published), no exception is caught. The process just seems to die and return the generic Apache error page - logging doesn't work anymore either.
Thus, in my specific case, a database entry was created, but I could not cleanly write a log message indicating that fact and which message could not be published.
One workaround I found is to create the connection to RabbitMQ on a per-request basis, establishing the connection directly before actually publishing the message.
That doesn't seem very efficient, though, given that Passenger worker processes handle more than a single request before they are discarded.
It does, however, properly catch the exception, log it and continue with the request handling:
def publish_message(exchange, key, message)
  rabbitmq_connection = Bunny.new $rabbit
  rabbitmq_connection.start

  ch = rabbitmq_connection.create_channel
  exchange = ch.exchange(exchange, durable: true, type: :topic)
  exchange.publish(message, routing_key: key, persistent: true)

  ch.close
  rabbitmq_connection.close
rescue Exception => e
  $log_file.error "Sending message #{message} to #{exchange} with key #{key} failed: #{e.message}"
end
Now I am wondering how to catch this kind of failure when following the recommended approach,
and whether there is any other best practice for Rack apps with Passenger that I just haven't found yet.
I'd appreciate any hints leading me to a better solution than my workaround.
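For what it's worth, here is a minimal sketch of how the recommended approach could be hardened (it assumes $rabbit holds the Bunny connection options, as in the workaround above): check the per-process connection before each publish, reconnect once if it has dropped, and only log a failure when the retry also fails.
def publish_message(exchange_name, key, message)
  retried = false
  begin
    # Reconnect if the per-process connection has gone away mid-request.
    unless $rabbitmq_connection && $rabbitmq_connection.open?
      $rabbitmq_connection = Bunny.new($rabbit)
      $rabbitmq_connection.start
    end
    ch = $rabbitmq_connection.create_channel
    exchange = ch.exchange(exchange_name, durable: true, type: :topic)
    exchange.publish(message, routing_key: key, persistent: true)
    ch.close
  rescue Bunny::Exception, IOError => e
    unless retried
      retried = true
      $rabbitmq_connection = nil # force a fresh connection on the retry
      retry
    end
    $log_file.error "Sending message #{message} to #{exchange_name} with key #{key} failed: #{e.message}"
  end
end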

Related

Sinatra, Puma, ActiveRecord: No connection pool with 'primary' found

I am building a service in Ruby 2.4.4, with Sinatra 2.0.5, ActiveRecord 5.2.2 and Puma 3.12.0. (I'm not using Rails.)
I have an endpoint which opens a DB connection (to a Postgres DB) and runs some queries, like this:
post '/endpoint' do
  # open a connection
  ActiveRecord::Base.establish_connection(@@db_configuration)
  # run some queries
  db_value = TableModel.find_by(xx: yy)
  return whatever
end

after do
  # after the endpoint finishes, close all open connections
  ActiveRecord::Base.clear_all_connections!
end
When I get two parallel requests to this endpoint, one of them fails with this error:
2019-01-12 00:22:07 - ActiveRecord::ConnectionNotEstablished - No connection pool with 'primary' found.:
C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/activerecord-5.2.2/lib/active_record/connection_adapters/abstract/connection_pool.rb:1009:in `retrieve_connection'
C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/activerecord-5.2.2/lib/active_record/connection_handling.rb:118:in `retrieve_connection'
C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/activerecord-5.2.2/lib/active_record/connection_handling.rb:90:in `connection'
C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/activerecord-5.2.2/lib/active_record/core.rb:207:in `find_by'
...
My discovery process went this way so far:
1. I looked at the connection usage in Postgres, thinking I might be leaking connections - no, I didn't seem to be.
2. Just in case, I increased the connection pool to 16 (corresponding to 16 Puma threads) - it didn't help.
3. Then I looked into the ActiveRecord sources. Here I realized why 2) didn't help. The problem is not that I can't get a connection, but that I can't get a connection pool (yes, yes, it says that in the exception). The @owner_to_pool map variable, from which a connection pool is obtained, stores the process_id as key and connection pools as values (actually, the value is also a map, where the key is a connection specification and the value, I presume, is an actual pool instance). In my case, I have only one connection spec to my only DB.
But Puma is a multithreaded web server. It runs all requests in the same process but in different threads.
Because of that, I think, the following happens:
1. The first request, starting in process_id=X, thread=Y, "checks out" the connection pool in establish_connection, based on process_id=X, and "takes" it. Now it's not present in @owner_to_pool.
2. The second request, starting in the same process_id=X but a different thread=Z, tries to do the same - but the connection pool for process_id=X is no longer present in @owner_to_pool. So the second request doesn't get a connection pool and fails with that exception.
3. The first request finishes successfully and puts the connection pool for process_id=X back in place by calling clear_all_connections!.
4. Another request, starting after all that and with no parallel requests in other threads, will succeed, because it will pick up the connection pool and put it back again with no problems.
I am not sure I understand everything 100% correctly, but it seems to me that something like this happens.
Now, my question is: what do I do with all this? How do I make the multithreaded Puma webserver work correctly with ActiveRecord's connection pool?
Thanks a lot in advance!
This question seems similar, but unfortunately it doesn't have an answer, and I don't have enough reputation to comment on it and ask the author if they solved it.
So, basically, I didn't realize that establish_connection creates a connection pool. (Yes, yes, I said so myself in the question. Still, I didn't quite realize it.)
What I ended up doing is this:
require ....

# create the connection pool with the required configuration - once;
# it'll belong to the process
ActiveRecord::Base.establish_connection(db_configuration)

at_exit {
  # close all connections on app exit
  ActiveRecord::Base.clear_all_connections!
}

class SomeClass < Sinatra::Base
  post '/endpoint' do
    # run some queries - they'll automatically use a connection from the pool
    db_value = TableModel.find_by(xx: yy)
    return whatever
  end
end
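If you'd rather scope connection usage explicitly per request instead, a common alternative (a sketch using the same TableModel as above) is to check a connection out of the process-wide pool for just the duration of a block:
post '/endpoint' do
  # Borrow a connection from the pool; it is returned automatically
  # when the block finishes, even if an exception is raised.
  ActiveRecord::Base.connection_pool.with_connection do
    db_value = TableModel.find_by(xx: yy)
    return whatever
  end
end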

SSL_write: bad write retry. Exception. Reading e-mails with IMAP IDLE. In Ruby

I would like to get unseen mails "as soon as possible", using a Ruby (2.1) script that implements the IMAP IDLE ("push notify") feature.
With the help of some guys (see also: Support for IMAP IDLE in ruby), I wrote the script here:
https://gist.github.com/solyaris/b993283667f15effa579
def idle_loop(imap, search_condition, folder)
  # https://stackoverflow.com/questions/4611716/how-imap-idle-works
  loop do
    begin
      imap.select folder
      imap.idle do |resp|
        #trap_shutdown
        # You'll get all the things from the server.
        # For new emails you're only interested in EXISTS ones
        if resp.kind_of?(Net::IMAP::UntaggedResponse) and resp.name == "EXISTS"
          # Got something. Send DONE. This breaks you out of the blocking call
          imap.idle_done
        end
      end
      # We're out, which means there are some emails ready for us.
      # Go do a search for UNSEEN and fetch them.
      retrieve_emails(imap, search_condition, folder) { |mail| process_email mail }
    #rescue Net::IMAP::Error => imap_err
    #  # Socket probably timed out
    #  puts "IMAP IDLE socket probably timed out.".red
    rescue SignalException => e
      # https://stackoverflow.com/questions/2089421/capturing-ctrl-c-in-ruby
      puts "Signal received at #{time_now}: #{e.class} #{e.message}".red
      shutdown imap
    rescue Exception => e
      puts "Something went wrong at #{time_now}: #{e.class} #{e.message}".red
      imap.noop
    end
  end
end
Now, all runs smoothly at first glance, BUT I get the exception
Something went wrong: SSL_write: bad write retry
at this line in the code:
https://gist.github.com/solyaris/b993283667f15effa579#file-idle-rb-L189
The error happens when I leave the script running for more than... say, more than 30 minutes.
BTW, the server is imap.gmail.com (arghh...), and I presume it is something related to the IMAP IDLE reconnection socket (I haven't yet read the Ruby IMAP library code), but I do not understand the reason for the exception.
Any idea what the reason for the exception is? Is just trapping the exception enough to fix the issue?
thanks
giorgio
UPDATE
I modified the exception handling a bit (see the gist code: https://gist.github.com/solyaris/b993283667f15effa579).
Now I get a Net::IMAP::Error (connection closed); I just restart the IMAP connection and it seems to work...
Sorry for the confusion; anyway, in general, any comments on the code I wrote and on correct management of the IDLE protocol are welcome.
The IMAP IDLE RFC says to stop IDLE after at most 29 minutes and reissue a new IDLE command. IMAP servers are permitted to assume that the client is dead and has gone away after 31 minutes of inactivity.
You may also find that some NAT middleboxes silently sabotage your connection long before the half-hour is up; I've seen timeouts as short as about two minutes. (Every time I see something like that I scream "vivat ipv6!") I don't think there's any good solution for those middleboxes, except maybe to infect them with a vile trojan, but the bad solutions include adjusting your idle timeout if you get the SSL exception before the half-hour is up.
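A minimal sketch of that advice applied to the question's loop (the 25-minute value and the watchdog thread are assumptions; recent Ruby versions also accept a timeout argument to Net::IMAP#idle, which makes the watchdog unnecessary):
IDLE_TIMEOUT = 25 * 60 # stay safely below the RFC's 29-minute ceiling

loop do
  imap.select folder
  # Watchdog: force DONE before the server is allowed to drop us.
  watchdog = Thread.new do
    sleep IDLE_TIMEOUT
    imap.idle_done rescue nil # ignore the error if IDLE already ended
  end
  imap.idle do |resp|
    imap.idle_done if resp.kind_of?(Net::IMAP::UntaggedResponse) && resp.name == "EXISTS"
  end
  watchdog.kill
  retrieve_emails(imap, search_condition, folder) { |mail| process_email mail }
end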

How can I properly handle persistent TCP socket connections (to simulate an HTTP server)?

So, I'm trying to simulate some basic HTTP persistent connections using sockets and Ruby - for a college class.
The point is to build a server - able to handle multiple clients - that receives a file path and gives back the file content - just like an HTTP GET.
The current server implementation loops listening for clients, fires a new thread when there's an incoming connection, and reads the file paths from this socket. It's very dumb, but it works fine when working with non-persistent connections - one request per connection.
But they should be persistent.
Which means the client shouldn't worry about closing the connection. In the non-persistent version the server echoes the response and closes the connection - goodbye client, farewell.
But being persistent means the server thread should loop and wait for more incoming requests until... well, until there are no more requests. How does the server know that? It doesn't! Some sort of timeout is needed. I tried to do that with Ruby's Timeout, but it didn't work.
Googling for some solutions - besides being thoroughly advised to avoid using the Timeout module - I've seen a lot of posts about the IO.select method, which should handle this socket-waiting issue much better than threads and stuff (which really sounds cool, considering how Ruby threads (don't) work). I'm trying to understand how IO.select works, but I still wasn't able to make it work in the current scenario.
So I'm asking basically two things:
how can I efficiently work this timeout issue on the server-side, either using some thread based solution, low-level socket options or some IO.select magic?
how can the client side know that the server has closed its side of the connection?
Here's the current code for the server:
require 'date'

module Sockettp
  class Server
    def initialize(dir, port = Sockettp::DEFAULT_PORT)
      @dir = dir
      @port = port
    end

    def start
      puts "Starting Sockettp server..."
      puts "Serving #{@dir.yellow} on port #{@port.to_s.green}"

      Socket.tcp_server_loop(@port) do |socket, client_addrinfo|
        handle socket, client_addrinfo
      end
    end

    private

    def handle(socket, addrinfo)
      Thread.new(socket) do |client|
        log "New client connected"
        begin
          loop do
            if client.eof?
              puts "#{'-' * 100} end connection"
              break
            end

            input = client.gets.chomp

            body = content_for(input)

            response = {}
            if body
              response.merge!({
                status: 200,
                body: body
              })
            else
              response.merge!({
                status: 404,
                body: Sockettp::STATUSES[404]
              })
            end

            log "#{addrinfo.ip_address} #{input} -- #{response[:status]} #{Sockettp::STATUSES[response[:status]]}".send(response[:status] == 200 ? :green : :red)
            client.puts(response.to_json)
          end
        ensure
          socket.close
        end
      end
    end

    def content_for(path)
      path = File.join(@dir, path)

      return File.read(path) if File.file?(path)
      return Dir["#{path}/*"] if File.directory?(path)
    end

    def log(msg)
      puts "#{Thread.current} -- #{DateTime.now.to_s} -- #{msg}"
    end
  end
end
Update
I was able to simulate the timeout behaviour using the IO.select method, but the implementation doesn't feel good when combined with a couple of threads for accepting new connections and another couple for handling requests. The concurrency makes the situation mad and unstable, and I'm probably not sticking with it unless I can figure out a better way of using this solution.
Update 2
Seems like Timeout is still the best way to handle this. I'm sticking with it till I find a better option.
I still don't know how to deal with zombie client connections.
Solution
I ended up using IO.select (I got inspired when looking at the WEBrick code). You can check the final version here (lib/http/server/client_handler.rb).
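For reference, here is a minimal sketch of the IO.select idea (the timeout value and the handle_request helper are assumptions, not the exact code from the gist): wait on the client socket with a timeout and treat a timeout or EOF as the end of the connection.
READ_TIMEOUT = 10 # seconds; an assumed value

def serve(client)
  loop do
    # Block until the socket is readable or the timeout expires;
    # IO.select returns nil on timeout.
    ready = IO.select([client], nil, nil, READ_TIMEOUT)
    break unless ready  # timed out: assume the client is gone
    line = client.gets
    break if line.nil?  # client closed its side of the connection
    handle_request(client, line.chomp)
  end
ensure
  client.close
end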
You should implement something like heartbeat packets. The client side should send special packets every few secs/mins to ensure that the server doesn't time out the connection on the client's end. The server just avoids doing anything in response to these packets.
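A client-side sketch of that heartbeat idea (the "PING" marker and the 30-second interval are assumptions, not part of the question's protocol); the server would simply ignore such lines instead of treating them as requests:
# Ping periodically so the server's read timeout is never reached
# while the client is still alive.
heartbeat = Thread.new do
  loop do
    sleep 30
    socket.puts "PING"
  end
end

# ... normal request/response traffic on the same socket ...

heartbeat.kill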

reconnect tcpsocket (or how to detect closed socket)

I have a ruby tcpsocket client that is connected to a server.
How can I check to see if the socket is connected before I send the data ?
Do I try to "rescue" a disconnected tcpsocket, reconnect and then resend? If so, does anyone have a simple code sample, as I don't know where to begin :(
I was quite proud that I managed to get a persistent connected client tcpsocket in rails. Then the server decided to kill the client and it all fell apart ;)
edit
I've used this code to get around some of the problems - it will try to reconnect if not connected, but it won't handle the case where the server is down (it will keep retrying). Is this the start of the right approach? Thanks
def self.write(data)
  begin
    @@my_connection.write(data)
  rescue Exception => e
    @@my_connection = TCPSocket.new 'localhost', 8192
    retry
  end
end
What I usually do in these types of scenarios is keep track of consecutive retries in a variable and have another variable that sets the retry ceiling. Once we hit the ceiling, throw some type of exception that indicates there is a network or server problem. You'll want to reset the retry count variable on success, of course.
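A sketch of that bounded-retry idea applied to the question's write method (the ceiling, the backoff and the rescued exception classes are assumptions):
MAX_RETRIES = 5 # assumed retry ceiling

def self.write(data)
  retries = 0
  begin
    @@my_connection ||= TCPSocket.new('localhost', 8192)
    @@my_connection.write(data)
  rescue SystemCallError, IOError => e
    @@my_connection = nil # drop the dead socket; reconnect on retry
    retries += 1
    raise "giving up after #{retries} attempts: #{e.message}" if retries >= MAX_RETRIES
    sleep(2 ** retries) # back off a little longer each time
    retry
  end
end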

Posting large number of messages to AMQP queue

Using v0.7.1 of the Ruby amqp library and Ruby 1.8.7, I am trying to post a large number (millions) of short (~40 bytes) messages to a RabbitMQ server. My program's main loop (well, not really a loop, but still) looks like this:
AMQP.start(:host => '1.2.3.4',
           :username => 'foo',
           :password => 'bar') do |connection|
  channel = AMQP::Channel.new(connection)
  exchange = channel.topic("foobar", {:durable => true})
  i = 0
  EM.add_periodic_timer(1) do
    print "\rPublished #{i} commits"
  end
  results = get_results # <- Returns an array
  processor = proc do
    if x = results.shift then
      exchange.publish(x, :persistent => true,
                       :routing_key => "test.#{i}")
      i += 1
      EM.next_tick processor
    end
  end
  EM.next_tick(processor)
  AMQP.stop {EM.stop}
end
The code starts processing the results array just fine, but after a while (usually after 12k messages or so) it dies with the following error:
/Library/Ruby/Gems/1.8/gems/amqp-0.7.1/lib/amqp/channel.rb:807:in `send':
The channel 1 was closed, you can't use it anymore! (AMQP::ChannelClosedError)
No messages are stored in the queue. The error seems to happen just when network activity from the program to the queue server starts.
What am I doing wrong?
First mistake is that you didn't post the RabbitMQ version that you are using. Lots of people are running the old, obsolete version 1.7.2 because that is what is in their OS package repositories. That's a bad move for anyone sending the volume of messages that you are. Get RabbitMQ 2.5.1 from the RabbitMQ site itself and get rid of your default system package.
Second mistake is that you did not tell us what is in the RabbitMQ logs.
Third mistake is that you said nothing about what is consuming the messages. Is there another process running somewhere that has declared a queue and bound it to the exchange? There is NO message queue unless somebody declares it to RabbitMQ and binds it to an exchange. Even then, messages will only flow if the binding key for the queue matches the routing key that you publish with.
Fourth mistake. You have routing keys and binding keys mixed up. The routing key is a string such as topic.test.json.echos and the binding key (used to bind a queue to an exchange) is a pattern like topic.# or topic.*.json.
Updated after your clarifications
Regarding versions, I'm not sure when it was fixed but there was a problem in 1.7.2 with large numbers of persistent messages causing RabbitMQ to crash when it rolled over its persistence log, and after crashing it was unable to restart until someone manually undid the rollover.
When you say that a connection is being opened and closed, I hope that it is not per message. That would be a strange way to use AMQP.
Let me repeat. Producers do NOT write messages to queues. They write messages to exchanges which then route the messages to queues based on the routing key (string) and the queue's binding key (pattern). In your example I misread the use of the # sign, but I see nothing which declares a queue and binds it to the exchange.
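A sketch of the missing piece under that reading (the queue name and binding pattern are assumptions for illustration), using the amqp 0.7.x API, where the binding option is :key (newer versions also accept :routing_key):
# Declare a durable queue and bind it to the "foobar" topic exchange so
# that published messages are actually routed and stored somewhere.
queue = channel.queue("test.results", :durable => true)
queue.bind(exchange, :key => "test.#")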
