Trouble Rescuing from Bunny Connection Net::ReadTimeout - ruby

I have a Ruby script that uses the Bunny gem to connect to a RabbitMQ instance. The script works for a while, but eventually dies because of a Net::ReadTimeout:
E, [2017-08-13T08:48:09.671988 #21351] ERROR -- #<Bunny::Session:0x39eca20 scrapes@104.196.154.25:5672, vhost=/, addresses=[104.196.154.25:5672]>: Uncaught exception from consumer #<Bunny::Consumer:32353120 @channel_id=1 @queue=sc_link_queue @consumer_tag=bunny-1502631967000-46739673895>: #<Net::ReadTimeout: Net::ReadTimeout> @ /home/rails/.rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill'
E, [2017-08-13T08:48:32.468023 #23205] ERROR -- #<Bunny::Session:0x42202a0 scrapes@104.196.154.25:5672, vhost=/, addresses=[104.196.154.25:5672]>: Uncaught exception from consumer #<Bunny::Consumer:36695920 @channel_id=1 @queue=sc_link_queue @consumer_tag=bunny-1502631972000-482787698591>: #<Net::ReadTimeout: Net::ReadTimeout> @ /home/rails/.rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill'
My script looks like this:
module Sc
  class Worker
    def initialize
      init()
    end

    def self.start_headless(type)
      Headless.new(display: 50, destroy_at_exit: false, reuse: true).start
      worker = new
      worker.send(type)
    end

    def init
      $conn ||= Bunny.new($rabbitmq_opts)
      $conn.start
      @browser = Sc::Browser.new()
    rescue Timeout::Error, Net::ReadTimeout, Selenium::WebDriver::Error::UnknownError, Errno::ECONNREFUSED, Selenium::WebDriver::Error::JavascriptError, Exception, StandardError => e
      LOGGER.error("[x] Trouble connecting to rabbitmq, retrying...")
      LOGGER.error("[x] #{e}")
      LOGGER.error("[x] #{e.backtrace}")
      retry
    end

    def listen_for_searches
      channel = $conn.create_channel
      channel.prefetch(1)
      queue = channel.queue($rabbitmq_search_queue, durable: true)
      exchange = channel.default_exchange
      queue.subscribe(:manual_ack => true, :block => true) do |delivery_info, properties, payload|
        LOGGER.info "[x] Received #{payload}"
        payload = JSON.parse(payload)
        scrape = Sc::Search.new(browser: @browser.browser, county: payload["name"], type: payload["type"], date_type: payload["date_type"])
        scrape.run
        scrape.close
        channel.ack(delivery_info.delivery_tag)
      end
    rescue Timeout::Error, Net::ReadTimeout, Selenium::WebDriver::Error::UnknownError, Errno::ECONNREFUSED, Selenium::WebDriver::Error::JavascriptError, Exception, StandardError => e
      LOGGER.error("[x] #{e}")
      LOGGER.error("[x] #{e.backtrace}")
      LOGGER.error("[x] Trouble with scrape, retrying...")
      retry
    end
  end
end
As you can see, I'm trying to rescue from pretty much everything that could happen, but I still can't get it to recover from the Net::ReadTimeout error. Once the worker dies you can still see that it is connected to RabbitMQ, but the last item it took from the queue is unacknowledged; it is essentially hung.

I have solved this. The issue was that everything that runs inside the Bunny subscribe block is handled in a different thread, so you need to add the rescue statements inside that block.
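For illustration, a minimal sketch of that fix inside listen_for_searches (the nack/requeue on failure is my assumption about how you might want failed messages handled, not something from the original post):

queue.subscribe(:manual_ack => true, :block => true) do |delivery_info, properties, payload|
  begin
    LOGGER.info "[x] Received #{payload}"
    payload = JSON.parse(payload)
    scrape = Sc::Search.new(browser: @browser.browser, county: payload["name"],
                            type: payload["type"], date_type: payload["date_type"])
    scrape.run
    scrape.close
    channel.ack(delivery_info.delivery_tag)
  rescue Timeout::Error, Net::ReadTimeout, StandardError => e
    # This rescue now runs in the consumer thread, so the error no longer kills it silently.
    LOGGER.error("[x] #{e}")
    LOGGER.error("[x] #{e.backtrace}")
    # Requeue the message so it is not left hanging unacknowledged (assumption).
    channel.nack(delivery_info.delivery_tag, false, true)
  end
end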

Related

Sidekiq transient vs fatal errors

Is there a way to raise an error from a Sidekiq job in a way that tells Sidekiq "this error is fatal and unrecoverable, do not retry, send it straight to the dead job queue"?
Looking at the Sidekiq Error Handling documentation, it seems like it interprets all errors as transient and will retry a job (if retry is enabled) regardless of the error type.
You should rescue those specific errors and not re-raise them.
def perform
  call_something
rescue CustomException
  nil
end
Edit:
Well, if you want to purposely send a message to the DLQ/DJQ, you'd need to write a method that does what #send_to_morgue does. I'm sure Mike Perham is going to come in here and yell at me for suggesting this, but...
def send_to_morgue(msg)
  Sidekiq.logger.info { "Adding dead #{msg['class']} job #{msg['jid']}" }
  payload = Sidekiq.dump_json(msg)
  now = Time.now.to_f
  Sidekiq.redis do |conn|
    conn.multi do
      conn.zadd('dead', now, payload)
      conn.zremrangebyscore('dead', '-inf', now - DeadSet.timeout)
      conn.zremrangebyrank('dead', 0, -DeadSet.max_jobs)
    end
  end
end
The only difference is that you'd have to dig into what msg looks like going into that method, but I suspect it's what normally hits the middleware before parsing.
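For reference, a Sidekiq job hash going into the middleware looks roughly like this (the exact keys vary by Sidekiq version and worker options, so treat it as a sketch):

msg = {
  'class'      => 'MyWorker',          # hypothetical worker name
  'args'       => [123, 'foo'],
  'jid'        => '8f6ac00c81a9b34f',  # made-up job id
  'queue'      => 'default',
  'retry'      => true,
  'created_at' => Time.now.to_f
}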
I found a solution for your problem on GitHub. In that post they suggest writing a custom middleware that handles the exceptions you want to prevent retries for. This is a basic example:
def call(worker, msg, queue)
  begin
    yield
  rescue ActiveRecord::RecordNotFound => e
    msg['retry'] = false
    raise
  end
end
Extending that, you get:
def call(worker, msg, queue)
  begin
    yield
  rescue ActiveRecord::RecordNotFound => e
    msg['retry'] = false
    raise
  rescue Exception => e
    if worker.respond_to?(:handle_error)
      worker.handle_error(e)
    else
      raise
    end
  end
end
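For completeness, a middleware like that still has to be added to the server middleware chain; a minimal sketch (the class name FatalErrorMiddleware is made up for the example):

# config/initializers/sidekiq.rb (or wherever you configure Sidekiq)
Sidekiq.configure_server do |config|
  config.server_middleware do |chain|
    chain.add FatalErrorMiddleware   # hypothetical class wrapping the call method above
  end
end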

Testing sidekiq worker and retryset Rails minitest

I have a Sidekiq middleware which catches a custom exception:
require 'celluloid'
require 'sidekiq/middleware/server/retry_jobs'

module Sidekiq
  class RetryMiddleware < Sidekiq::Middleware::Server::RetryJobs
    def call(worker, msg, queue)
      yield
    rescue Sidekiq::Shutdown
      # ignore, will be pushed back onto queue during hard_shutdown
      raise
    rescue Sidekiq::Retries::Retry => e
      # force a retry (for workers that have retries disabled)
      msg['retry'] = e.max_retries
      attempt_retry(worker, msg, queue, e.cause)
      raise e.cause
    rescue Sidekiq::Retries::Fail => e
      # seriously, don't retry this
      raise e.cause
    rescue Exception => e
      # ignore, will be pushed back onto queue during hard_shutdown
      raise Sidekiq::Shutdown if exception_caused_by_shutdown?(e)
      raise e unless msg['retry']
      attempt_retry(worker, msg, queue, e)
      raise e
    end
  end
end
and my worker looks like this:
class SomeWorker
  include Sidekiq::Worker
  sidekiq_options retry: false

  def perform(input_data)
    begin
      # logic to insert data into db
    rescue Ione::Io::ConnectionClosedError => e
      raise Sidekiq::Retries::Retry
    end
  end
end
When I try to test that the SomeWorker perform method adds the job to the retry set, I am not seeing the middleware getting called in the test.
Thanks in advance
You're making this way too hard on yourself; just call the methods directly.
RetryMiddleware.new.call(MyWorker.new, { ... }, 'default') do
  MyWorker.new.perform(...)
end
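For instance, a minimal Minitest case along these lines (the msg keys and the assertion are assumptions; the worker and middleware classes are the ones from the question):

require 'minitest/autorun'

class RetryMiddlewareTest < Minitest::Test
  def test_middleware_wraps_the_job
    msg = { 'class' => 'SomeWorker', 'args' => [{}], 'retry' => false }
    called = false

    Sidekiq::RetryMiddleware.new.call(SomeWorker.new, msg, 'default') do
      called = true   # stand-in for SomeWorker.new.perform(...)
    end

    assert called, 'the block wrapped by the middleware should have run'
  end
end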

catch undefined method exception from a thread and restart it in ruby

I have an ActiveMQ topic subscriber in Ruby which uses the STOMP protocol with failover to connect to the broker. If ActiveMQ gets restarted, I sometimes get an exception:
undefined method `command' for nil:NilClass
/usr/lib64/ruby/gems/1.8/gems/stomp-1.1.8/lib/stomp/client.rb:295:in `start_listeners'
/usr/lib64/ruby/gems/1.8/gems/stomp-1.1.8/lib/stomp/client.rb:108:in `join'
/usr/lib64/ruby/gems/1.8/gems/stomp-1.1.8/lib/stomp/client.rb:108:in `join'
./lib/active_mq_topic_reader.rb:31:in `active_mq_topic_reader'
main.rb:164
main.rb:163:in `initialize'
main.rb:163:in `new'
but I only get this exception when I use the join() method on the broker thread; otherwise no exception appears and the subscriber just gets unsubscribed from the topic.
The problem I am facing is that I have a different mechanism for shutting down the process, by sending a shutdown signal that the process waits for, but if I use join() the process gets stuck on that line and I can no longer stop it with the shutdown signal. So what should I do to catch the exception and restart the listener thread?
active_mq_topic_reader.rb:
require 'rubygems'
require 'ffi-rzmq'
require 'msgpack'
require 'zmq_helper'
require 'stomp'

include ZmqHelper

def active_mq_topic_reader(context, notice_agg_fe_url, signal_pub_url, monitor_url, active_mq_broker, topic)
  begin
    sender = create_connect_socket(context, ZMQ::PUSH, notice_agg_fe_url)
    monitor = create_connect_socket(context, ZMQ::PUSH, monitor_url)
    active_mq_broker.subscribe(topic, {}) do |msg|
      notice = {}
      ["entity_id","entity_type","system_name","change_type"].each do |key|
        notice[key] = msg.headers[key].to_s
      end
      monitor.send_string("qreader-#{topic.slice(7..-1)}")
      sender.send_string(notice.to_msgpack)
    end
    active_mq_broker.join() # cannot use this
    signal_subscriber.recv_string() # here the code waits for the shutdown signal in case of process shutdown
    sender.close()
    monitor.close()
    signal_subscriber.close()
    active_mq_broker.unsubscribe(topic)
    return
  rescue Exception => e
    puts "#{topic}: #{e}"
    puts e.backtrace
    $stdout.flush
  end
end
main.rb:
context = ZMQ::Context.new(1)
active_mq_broker_audit = Stomp::Client.new("failover:(stomp://localhost:61613,stomp://localhost:61613)")

new_thread = Thread.new do
  active_mq_topic_reader(context,
                         "inproc://notice_agg_fe",
                         "inproc://signal_pub",
                         "tcp://localhost:xxxx",
                         active_mq_broker_audit,
                         "/topic/myTopic")
end
new_thread.join()
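One way to get what the question asks for (a sketch, not from the original thread): run the reader in a supervising thread that rescues the error, rebuilds the Stomp client and retries, so the main thread stays free to wait for the shutdown signal instead of joining. Note the reader's own rescue Exception would have to re-raise for the supervisor to see the failure.

reader_thread = Thread.new do
  begin
    broker = Stomp::Client.new("failover:(stomp://localhost:61613,stomp://localhost:61613)")
    active_mq_topic_reader(context,
                           "inproc://notice_agg_fe",
                           "inproc://signal_pub",
                           "tcp://localhost:xxxx",
                           broker,
                           "/topic/myTopic")
  rescue StandardError => e
    # covers the NoMethodError ("undefined method `command' for nil") from the dead client
    puts "listener died: #{e}; restarting"
    sleep 1
    retry
  end
end
# the main thread can now wait for the shutdown signal instead of calling join()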

Unable to make socket Accept Non Blocking ruby 2.2

I have been searching the whole day for a non-blocking socket accept. I found a non-blocking recv, but that doesn't benefit me in any way. My script first creates a new socket, binds it to IP 127.0.0.1 and port 6112, and starts listening. Then it starts a thread which takes @sock.accept, and that is blocking. I then used accept_nonblock, but that throws me the following error:
IO::EWOULDBLOCKWaitReadable : A non-blocking socket operation could not be completed immediately. - accept(2) would block
I am using Ruby 2.2.
NOTE: I do not intend to use Rails to solve my problem, or give me a shortcut. I am sticking with pure Ruby 2.2.
Here is my script:
require 'socket'
include Socket::Constants

@sock = Socket.new(AF_INET, SOCK_STREAM, 0)
@sockaddr = Socket.sockaddr_in(6112, '127.0.0.1')
@sock.bind(@sockaddr)
@sock.listen(5)

Thread.new(@sock.accept_nonblock) do |connection|
  @client = Client.new(ip, connection, self)
  @clients.push(@client)
  begin
    while connection
      packet = connection.recv(55555)
      if packet == nil
        DeleteClient(connection)
      else
        @toput = "[RECV]: #{packet}"
        puts @toput
      end
    end
  rescue Exception => e
    if e.class != IOError
      line1 = e.backtrace[0].split(".rb").last
      line = line1.split(":")[1]
      #Log.Error(e, e.class, e.backtrace[0].split(".rb").first + ".rb", line)
      puts "#{ e } (#{ e.class })"
    end
  end
end

def DeleteClient(connection)
  @clients.delete(@client)
  connection.close
end
http://docs.ruby-lang.org/en/2.2.0/Socket.html#method-i-accept_nonblock
accept_nonblock raises an exception when it can't immediately accept a connection. You are expected to rescue this exception and then IO.select the socket.
begin # emulate blocking accept
  client_socket, client_addrinfo = socket.accept_nonblock
rescue IO::WaitReadable, Errno::EINTR
  IO.select([socket])
  retry
end
A patch has recently been accepted which will add an exception: false option to accept_nonblock, which will allow you to use it without using exceptions for flow control. I don't know that it's shipped yet, though.
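Once that option is available (it landed in a Ruby release newer than 2.2), the exception-free form looks roughly like this sketch:

# Sketch: accept without using exceptions for flow control.
# accept_nonblock(exception: false) returns :wait_readable instead of raising.
loop do
  result = @sock.accept_nonblock(exception: false)
  if result == :wait_readable
    IO.select([@sock])                      # wait until the listening socket is readable
  else
    client_socket, client_addrinfo = result
    handle_connection(client_socket)        # hypothetical handler for the new connection
  end
end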
I'm going on a limb here, and posting a large chunk of code.
I hope it will answer both your question and the any related questions others reading this answer might raise.
I'm sorry if I went overboard, I just thought it was almost all relevant.
Issues like looping through an event stack, using IO.select to push events in a non-blocking manner, and other performance issues are all related (in my opinion) to the non-blocking concept of socket programming.
So I'm posting a Ruby module which acts as a server with a reactor, using a limited number of threads rather than thousands of threads, one per connection (12 threads will give you better performance than a hundred). The reactor utilizes the IO.select method with a timeout once all its active events are handled.
The module can set up multiple listening sockets which use #accept_nonblock, and they all currently act as an echo server.
It's basically the same code I used for the Plezi framework's core... with some stripped down functionality.
The following is a thread-pool with 12 working threads + the main thread (which will sleep and wait for the "TERM" signal)...
...And it's an example of an accept_nonblock with exception handling and a thread pool.
It's a simple socket echo server; test it as a remote client using telnet:
> telnet localhost 3000
Hi!
# => Hi!
bye
#=> will disconnect
Here's the code. Good luck!
require 'socket'

module SmallServer
  module_function

  ####
  # Replace this method with your actual server logic.
  #
  # this code will be called when a socket receives data.
  #
  # For now, we will just echo.
  def got_data io, io_params
    begin
      got = io.recv_nonblock( 1048576 ) # with maximum number of bytes to read at a time...
      puts "echoing: #{got}"
      if got.match /^(exit|bye|q)\R/
        puts 'closing connection.'
        io.puts "bye bye!"
        remove_connection io
      else
        io.puts "echoing: #{got}"
      end
    rescue => e
      # should also log error
      remove_connection io
    end
  end

  #########
  # main loop and activation code
  #
  # This will create a thread pool and set them running.
  def start
    # prepare threads
    exit_flag = false
    max_threads = 12
    threads = []
    thread_cycle = Proc.new do
      io_review rescue false
      true while fire_event
    end
    (max_threads).times { threads << Thread.new { thread_cycle.call until exit_flag } }

    # set signal traps
    trap('INT'){ exit_flag = true; raise "close!" }
    trap('TERM'){ exit_flag = true; raise "close!" }
    puts "Services running. Press ^C to stop"

    # sleep until trap raises exception (cycling might cause the main thread to lose signals that might be caught inside rescue clauses)
    (sleep unless SERVICES.empty?) rescue true
    # start shutdown.
    exit_flag = true
    # set fallback traps
    trap('INT'){ puts 'Forced exit.'; Kernel.exit }
    trap('TERM'){ puts 'Forced exit.'; Kernel.exit }
    puts 'Started shutdown process. Press ^C to force quit.'
    # shut down listening sockets
    stop_services
    # disconnect active connections
    stop_connections
    # cycle down threads
    puts "Waiting for workers to cycle down"
    threads.each {|t| t.join if t.alive?}
    # rundown any active events
    thread_cycle.call
  end

  #######################
  ## Events (Callbacks) / Multi-tasking Platform

  EVENTS = []
  E_LOCKER = Mutex.new

  # returns true if there are any unhandled events
  def events?
    E_LOCKER.synchronize {!EVENTS.empty?}
  end

  # pushes an event to the event's stack
  # if a block is passed along, it will be used as a callback: the block will be called with the values returned by the handler's `call` method.
  def push_event handler, *args, &block
    if block
      E_LOCKER.synchronize {EVENTS << [(Proc.new {|a| push_event block, handler.call(*a)} ), args]}
    else
      E_LOCKER.synchronize {EVENTS << [handler, args]}
    end
  end

  # Runs the block asynchronously by pushing it as an event to the event's stack
  #
  def run_async *args, &block
    E_LOCKER.synchronize {EVENTS << [ block, args ]} if block
    !block.nil?
  end

  # creates an asynchronous call to a method, with an optional callback (shortcut)
  def callback object, method, *args, &block
    push_event object.method(method), *args, &block
  end

  # event handling FIFO
  def fire_event
    event = E_LOCKER.synchronize {EVENTS.shift}
    return false unless event
    begin
      event[0].call(*event[1])
    rescue OpenSSL::SSL::SSLError => e
      puts "SSL Bump - SSL Certificate refused?"
    rescue Exception => e
      raise if e.is_a?(SignalException) || e.is_a?(SystemExit)
      puts "error: #{e.message}"
    end
    true
  end

  #####
  # Reactor
  #
  # IO review code will review the connections and sockets
  # it will accept new connections and react to socket input

  IO_LOCKER = Mutex.new

  def io_review
    IO_LOCKER.synchronize do
      return false unless EVENTS.empty?
      united = SERVICES.keys + IO_CONNECTION_DIC.keys
      return false if united.empty?
      io_r = (IO.select(united, nil, united, 0.1) )
      if io_r
        io_r[0].each do |io|
          if SERVICES[io]
            begin
              callback self, :add_connection, io.accept_nonblock, SERVICES[io]
            rescue Errno::EWOULDBLOCK => e
            rescue => e
              # log
            end
          elsif IO_CONNECTION_DIC[io]
            callback(self, :got_data, io, IO_CONNECTION_DIC[io] )
          else
            puts "what?!"
            remove_connection(io)
            SERVICES.delete(io)
          end
        end
        io_r[2].each { |io| (remove_connection(io) || SERVICES.delete(io)).close rescue true }
      end
    end
    callback self, :clear_connections
    true
  end

  #######################
  # IO - listening sockets (services)

  SERVICES = {}
  S_LOCKER = Mutex.new

  def add_service port = 3000, parameters = {}
    parameters[:port] ||= port
    parameters.update port if port.is_a?(Hash)
    service = TCPServer.new(parameters[:port])
    S_LOCKER.synchronize {SERVICES[service] = parameters}
    callback Kernel, :puts, "Started listening on port #{port}."
    true
  end

  def stop_services
    puts 'Stopping services'
    S_LOCKER.synchronize {SERVICES.each {|s, p| (s.close rescue true); puts "Stopped listening on port #{p[:port]}"}; SERVICES.clear }
  end

  #####################
  # IO - Active connections handling

  IO_CONNECTION_DIC = {}
  C_LOCKER = Mutex.new

  def stop_connections
    C_LOCKER.synchronize {IO_CONNECTION_DIC.each {|io, params| io.close rescue true} ; IO_CONNECTION_DIC.clear}
  end

  def add_connection io, more_data
    C_LOCKER.synchronize {IO_CONNECTION_DIC[io] = more_data} if io
  end

  def remove_connection io
    C_LOCKER.synchronize { IO_CONNECTION_DIC.delete io; io.close rescue true }
  end

  # clears closed connections from the stack
  def clear_connections
    C_LOCKER.synchronize { IO_CONNECTION_DIC.delete_if {|io, params| io.closed? } }
  end
end
start the echo server in irb with:
SmallServer.add_service(3000) ; SmallServer.start

EventMachine and em-websocket - reading from a queue and pushing to a channel

I'm using EventMachine to read from a HornetQ topic and push to an EM::Channel which is subscribed to by EM WebSocket connections. I need to prevent the @topic.receive loop from blocking, so I have created a proc and am calling EventMachine.defer with no callback. This will run indefinitely, and it works fine. I could also have just used Thread.new.
My question is: is this the correct way to read from a stream/queue and pass the data to the channel, and is there a better or any other way to do this?
require 'em-websocket'
require 'torquebox-messaging'

class WebsocketServer
  def initialize
    @channel = EM::Channel.new
    @topic = TorqueBox::Messaging::Topic.new('/topics/mytopic')
  end

  def start
    EventMachine.run do
      topic_to_channel = proc do
        while true
          msg = @topic.receive
          @channel.push msg
        end
      end
      EventMachine.defer(topic_to_channel)

      EventMachine::WebSocket.start(:host => "127.0.0.1", :port => 8081, :debug => false) do |connection|
        connection.onopen do
          sid = @channel.subscribe { |msg| connection.send msg }
          connection.onclose do
            @channel.unsubscribe(sid)
          end
        end
      end
    end
  end
end

WebsocketServer.new.start
This is OK, but EM.defer will spawn a pool of 20 threads, so I would avoid it for your use case. In general I would avoid EM entirely, especially the Java reactor, as we never finished it.
TorqueBox has a native STOMP-over-WebSockets solution that would be a much better way to go in this context, and it solves a bunch of other encapsulation challenges for you.
If you really want to stick with EM for this, then I'd use Thread.new instead of defer, so as to avoid having 19 idle threads taking up extra RAM for no reason.
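As a sketch of that suggestion, the same loop from the question's proc can live in a single dedicated thread:

# One dedicated thread for the blocking receive loop instead of EM.defer's pool.
Thread.new do
  loop do
    msg = @topic.receive    # blocking call stays off the EM reactor thread
    @channel.push msg       # EM::Channel#push schedules onto the reactor internally
  end
end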
