Catch an undefined method exception from a thread and restart it in Ruby

I have an ActiveMQ topic subscriber in Ruby that uses the STOMP protocol with failover to connect to the broker. If ActiveMQ gets restarted, I sometimes get this exception:
undefined method `command' for nil:NilClass
/usr/lib64/ruby/gems/1.8/gems/stomp-1.1.8/lib/stomp/client.rb:295:in `start_listeners'
/usr/lib64/ruby/gems/1.8/gems/stomp-1.1.8/lib/stomp/client.rb:108:in `join'
/usr/lib64/ruby/gems/1.8/gems/stomp-1.1.8/lib/stomp/client.rb:108:in `join'
./lib/active_mq_topic_reader.rb:31:in `active_mq_topic_reader'
main.rb:164
main.rb:163:in `initialize'
main.rb:163:in `new'
However, I get this exception only when I call join() on the broker thread. Otherwise no exception appears and the subscriber is silently unsubscribed from the topic.
The problem is that I shut the process down through a separate mechanism: the process waits until it receives a shutdown signal. If I call join(), the process blocks on that line and I can no longer stop it with the shutdown signal. What should I do to catch the exception and restart the listener thread?
active_mq_topic_reader.rb :
require 'rubygems'
require 'ffi-rzmq'
require 'msgpack'
require 'zmq_helper'
require 'stomp'
include ZmqHelper
def active_mq_topic_reader(context, notice_agg_fe_url, signal_pub_url, monitor_url, active_mq_broker, topic)
begin
sender = create_connect_socket(context, ZMQ::PUSH, notice_agg_fe_url)
monitor = create_connect_socket(context, ZMQ::PUSH, monitor_url)
active_mq_broker.subscribe(topic, {}) do |msg|
notice = {}
["entity_id","entity_type","system_name","change_type"].each do |key|
notice[key] = msg.headers[key].to_s
end
monitor.send_string("qreader-#{topic.slice(7..-1)}")
sender.send_string(notice.to_msgpack)
end
active_mq_broker.join() #cannot use this
signal_subscriber.recv_string() #here the code waits for shutdown signal in case of process shutdown
sender.close()
monitor.close()
signal_subscriber.close()
active_mq_broker.unsubscribe(topic)
return
rescue Exception => e
puts "#{topic}: #{e}"
puts e.backtrace
$stdout.flush
end
end
main.rb :
context = ZMQ::Context.new(1)
active_mq_broker_audit = Stomp::Client.new("failover:(stomp://localhost:61613,stomp://localhost:61613)")
new_thread = Thread.new do
active_mq_topic_reader(context,
"inproc://notice_agg_fe",
"inproc://signal_pub",
"tcp://localhost:xxxx",
active_mq_broker_audit,
"/topic/myTopic")
end
new_thread.join()
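One possible direction, sketched with plain Ruby threads only: Thread#join accepts a timeout, returns nil while the thread is still running, and re-raises the thread's exception in the caller if the thread died with one. Polling with a short timeout lets the caller check a shutdown condition between polls and restart the worker after a failure. The supervise helper, the stop_requested lambda and the poll_interval parameter are hypothetical names for this illustration. Whether Stomp::Client#join accepts a timeout in this gem version is an assumption that would need checking.

```ruby
# Runs `work` in a thread and restarts it whenever it dies with an error,
# until `stop_requested` returns true. Returns the number of restarts.
# All names here are hypothetical; this is not part of the stomp gem.
def supervise(stop_requested, poll_interval: 0.05, &work)
  restarts = 0
  start = lambda do
    Thread.new do
      Thread.current.report_on_exception = false # errors surface through join below
      work.call
    end
  end
  thread = start.call
  until stop_requested.call
    begin
      # join(timeout) returns nil while the thread is alive, the thread itself
      # once it has finished, and re-raises the thread's exception if it failed.
      break if thread.join(poll_interval)
    rescue StandardError => e
      warn "listener died (#{e.class}: #{e.message}), restarting"
      restarts += 1
      thread = start.call
    end
  end
  thread.kill # no-op if the thread has already finished
  restarts
end

# Usage: the first two attempts fail, the third one requests shutdown.
attempts = 0
stop = false
restarts = supervise(-> { stop }) do
  attempts += 1
  raise NoMethodError, "undefined method `command' for nil:NilClass" if attempts < 3
  stop = true # stands in for the real shutdown signal
end
puts "restarts: #{restarts}, attempts: #{attempts}"
```

In the real reader, the block passed to supervise would re-create the subscription, and stop_requested would check whatever the shutdown socket has received.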

Related

Gracefully unsubscribe from redis at exit

I have a ruby program which listens to a redis channel:
module Listener
class << self
def listen
redis.subscribe "messaging" do |on|
on.message do |_, msg|
Notify.about(msg)
end
end
end
def redis
@redis ||= Redis.new(driver: :hiredis)
end
end
end
Every time I deploy the app I restart the process with
kill -15 listener-pid
But Airbrake notifies me about the SignalException: SIGTERM with the following backtrace
/gems/hiredis-0.6.1/lib/hiredis/ext/connection.rb:19 in read
/gems/hiredis-0.6.1/lib/hiredis/ext/connection.rb:19 in read
/gems/redis-3.3.3/lib/redis/connection/hiredis.rb:54 in read
/gems/redis-3.3.3/lib/redis/client.rb:262 in block in read
/gems/redis-3.3.3/lib/redis/client.rb:250 in io
/gems/redis-3.3.3/lib/redis/client.rb:261 in read
/gems/redis-3.3.3/lib/redis/client.rb:136 in block (3 levels) in call_loop
/gems/redis-3.3.3/lib/redis/client.rb:135 in loop
/gems/redis-3.3.3/lib/redis/client.rb:135 in block (2 levels) in call_loop
/gems/redis-3.3.3/lib/redis/client.rb:231 in block (2 levels) in process
/gems/redis-3.3.3/lib/redis/client.rb:367 in ensure_connected
/gems/redis-3.3.3/lib/redis/client.rb:221 in block in process
/gems/redis-3.3.3/lib/redis/client.rb:306 in logging
/gems/redis-3.3.3/lib/redis/client.rb:220 in process
/gems/redis-3.3.3/lib/redis/client.rb:134 in block in call_loop
/gems/redis-3.3.3/lib/redis/client.rb:280 in with_socket_timeout
/gems/redis-3.3.3/lib/redis/client.rb:133 in call_loop
/gems/redis-3.3.3/lib/redis/subscribe.rb:43 in subscription
/gems/redis-3.3.3/lib/redis/subscribe.rb:12 in subscribe
/gems/redis-3.3.3/lib/redis.rb:2765 in _subscription
/gems/redis-3.3.3/lib/redis.rb:2143 in block in subscribe
/gems/redis-3.3.3/lib/redis.rb:58 in block in synchronize
/usr/lib/ruby/2.4.0/monitor.rb:214 in mon_synchronize
/gems/redis-3.3.3/lib/redis.rb:58 in synchronize
/gems/redis-3.3.3/lib/redis.rb:2142 in subscribe
Is it possible to restart the listener process gracefully so that I won't receive SIGTERM errors?
I found a pubsub example in redis-rb.
After I added trap('SIGTERM') { exit }, the problem was fixed.
Now my listener class looks like this:
module Listener
class << self
def listen
trap('SIGTERM') { exit }
redis.subscribe "messaging" do |on|
on.message do |_, msg|
Notify.about(msg)
end
end
end
def redis
@redis ||= Redis.new(driver: :hiredis)
end
end
end
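As a rough illustration of why the trap helps: calling exit inside the handler raises SystemExit in the main thread, so ensure blocks still run and the process ends through the normal exit path instead of an unhandled SignalException. The sketch below needs no Redis. The sleep stands in for the blocking subscribe call, and SystemExit is rescued only so that the demonstration can keep running and inspect the result.

```ruby
cleaned_up = false
exit_seen  = false

# Install the same handler as in the answer and remember the previous one.
previous_handler = trap('TERM') { exit }

begin
  begin
    Process.kill('TERM', Process.pid) # simulate `kill -15 listener-pid`
    sleep 5                           # stands in for the blocking redis.subscribe
  ensure
    cleaned_up = true                 # cleanup code still runs on the way out
  end
rescue SystemExit
  exit_seen = true                    # rescued here only for demonstration
ensure
  trap('TERM', previous_handler)      # restore the original handler
end

puts "SystemExit raised: #{exit_seen}, cleanup ran: #{cleaned_up}"
```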

Trouble Rescuing from Bunny Connection Net::ReadTimeout

I have a Ruby script that uses the Bunny gem to connect to a RabbitMQ instance. The script works for a while, but eventually dies because of a Net::ReadTimeout:
E, [2017-08-13T08:48:09.671988 #21351] ERROR -- #<Bunny::Session:0x39eca20 scrapes@104.196.154.25:5672, vhost=/, addresses=[104.196.154.25:5672]>: Uncaught exception from consumer #<Bunny::Consumer:32353120 @channel_id=1 @queue=sc_link_queue> @consumer_tag=bunny-1502631967000-46739673895>: #<Net::ReadTimeout: Net::ReadTimeout> @ /home/rails/.rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill'
E, [2017-08-13T08:48:32.468023 #23205] ERROR -- #<Bunny::Session:0x42202a0 scrapes@104.196.154.25:5672, vhost=/, addresses=[104.196.154.25:5672]>: Uncaught exception from consumer #<Bunny::Consumer:36695920 @channel_id=1 @queue=sc_link_queue> @consumer_tag=bunny-1502631972000-482787698591>: #<Net::ReadTimeout: Net::ReadTimeout> @ /home/rails/.rvm/rubies/ruby-2.3.3/lib/ruby/2.3.0/net/protocol.rb:158:in `rbuf_fill'
My script looks like this
module Sc
class Worker
def initialize
init()
end
def self.start_headless(type)
Headless.new(display: 50, destroy_at_exit: false, reuse: true).start
worker = new
worker.send(type)
end
def init
$conn ||= Bunny.new($rabbitmq_opts)
$conn.start
@browser = Sc::Browser.new()
rescue Timeout::Error, Net::ReadTimeout, Selenium::WebDriver::Error::UnknownError, Errno::ECONNREFUSED, Selenium::WebDriver::Error::JavascriptError, Exception, StandardError => e
LOGGER.error("[x] Trouble connecting to rabbitmq, retrying...")
LOGGER.error("[x] #{e}")
LOGGER.error("[x] #{e.backtrace}")
retry
end
def listen_for_searches
channel = $conn.create_channel
channel.prefetch(1)
queue = channel.queue($rabbitmq_search_queue, durable: true)
exchange = channel.default_exchange
queue.subscribe(:manual_ack => true, :block => true) do |delivery_info, properties, payload|
LOGGER.info "[x] Received #{payload}"
payload = JSON.parse(payload)
scrape = Sc::Search.new(browser: @browser.browser, county: payload["name"], type: payload["type"], date_type: payload["date_type"])
scrape.run
scrape.close
channel.ack(delivery_info.delivery_tag)
end
rescue Timeout::Error, Net::ReadTimeout, Selenium::WebDriver::Error::UnknownError, Errno::ECONNREFUSED, Selenium::WebDriver::Error::JavascriptError, Exception, StandardError => e
LOGGER.error("[x] #{e}")
LOGGER.error("[x] #{e.backtrace}")
LOGGER.error("[x] Trouble with scrape, retrying...")
retry
end
end
end
As you can see I'm trying to rescue from pretty much everything that could happen. I still can't seem to get it to recover from the Net::ReadTimeout error. Once the worker dies you can still see that it is connected to rabbitmq, but the last item it took from the queue is unacknowledged, it is essentially hung.
I have solved this. The issue was that everything inside the Bunny subscribe block runs in a different thread, so the rescue clauses need to go inside that block.
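A minimal sketch of that idea without Bunny, assuming the handler runs on a thread owned by the library: a rescue around the call that registers the handler never sees errors raised later on that thread, so the rescue has to live inside the block itself. The subscribe_on_worker_thread helper is a hypothetical stand-in, not Bunny's API.

```ruby
require 'timeout'

# Hypothetical stand-in for a library that invokes the handler on its own thread.
def subscribe_on_worker_thread(&handler)
  Thread.new { handler.call('payload') }
end

handled = []

worker = subscribe_on_worker_thread do |payload|
  begin
    # Simulate the failure; Net::ReadTimeout is a subclass of Timeout::Error.
    raise Timeout::Error, "timed out while processing #{payload}"
  rescue Timeout::Error => e
    # Rescued on the worker thread, where the error was actually raised.
    handled << e.message
  end
end

worker.join # only so that this demonstration waits for the handler to finish
puts handled.inspect
```

In the original script, the same begin/rescue would wrap the body of the queue.subscribe block, with the acknowledge or retry decision made inside the rescue.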

Handle exceptions in concurrent-ruby thread pool

How to handle exceptions in concurrent-ruby thread pools (http://ruby-concurrency.github.io/concurrent-ruby/file.thread_pools.html)?
Example:
pool = Concurrent::FixedThreadPool.new(5)
pool.post do
raise 'something goes wrong'
end
# how to rescue this exception here
Update:
Here is simplified version of my code:
def process
pool = Concurrent::FixedThreadPool.new(5)
products.each do |product|
new_product = generate_new_product
pool.post do
store_in_db(new_product) # here exception is raised, e.g. connection to db failed
end
end
pool.shutdown
pool.wait_for_termination
end
So what I want to achieve is to stop processing (break the loop) in case of any exception.
This exception is also rescued at a higher level of the application, where some cleanup jobs are executed (like setting the state of the model to failure and sending some notifications).
The following answer is from jdantonio from here https://github.com/ruby-concurrency/concurrent-ruby/issues/616
"
Most applications should not use thread pools directly. Thread pools are a low-level abstraction meant for internal use. All of the high-level abstractions in this library (Promise, Actor, etc.) all post jobs to the global thread pool and all provide exception handling. Simply pick the abstraction that best fits your use case and use it.
If you feel the need to configure your own thread pool rather than use the global thread pool, you can still use the high-level abstractions. They all support an :executor option which allows you to inject your custom thread pool. You can then use the exception handling provided by the high-level abstraction.
If you absolutely insist on posting jobs directly to a thread pool rather than using our high-level abstractions (which I strongly discourage) then just create a job wrapper. You can find examples of job wrappers in all our high-level abstractions, Rails ActiveJob, Sucker Punch, and other libraries which use our thread pools."
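To make the "job wrapper" idea concrete, here is a rough sketch using plain threads and a Queue from the standard library rather than a concurrent-ruby pool. The wrap_job helper is a hypothetical name. It only shows the general shape: each job is wrapped so that its exception is captured and handed back to the caller instead of being lost on the worker thread.

```ruby
# Wraps a job so that any StandardError it raises is collected in `errors`.
# Hypothetical helper for illustration; not part of concurrent-ruby.
def wrap_job(errors, &job)
  proc do
    begin
      job.call
    rescue StandardError => e
      errors << e
    end
  end
end

errors = Queue.new # thread-safe collection of captured exceptions

jobs = [
  wrap_job(errors) { :stored },
  wrap_job(errors) { raise ArgumentError, 'connection to db failed' }
]

# Plain threads stand in for the pool; with a pool, each wrapped proc
# would be handed to the pool instead.
jobs.map { |job| Thread.new(&job) }.each(&:join)

first_error = errors.empty? ? nil : errors.pop
puts "captured: #{first_error.class}" if first_error
```

The caller can then re-raise first_error so that the higher-level rescue in the application still sees it.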
So how about an implementation with Promises?
http://ruby-concurrency.github.io/concurrent-ruby/Concurrent/Promise.html
In your case it would look something like this:
promises = []
products.each do |product|
new_product = generate_new_product
promises << Concurrent::Promise.execute do
store_in_db(new_product)
end
end
# .value will wait for the Thread to finish.
# The ! means, that all exceptions will be propagated to the main thread
# .zip will make one Promise which contains all other promises.
Concurrent::Promise.zip(*promises).value!
There may be a better way, but this does work. You will want to change the error handling within wait_for_pool_to_finish.
def process
pool = Concurrent::FixedThreadPool.new(10)
errors = Concurrent::Array.new
10_000.times do
pool.post do
begin
# do the work
rescue StandardError => e
errors << e
end
end
end
wait_for_pool_to_finish(pool, errors)
end
private
def wait_for_pool_to_finish(pool, errors)
pool.shutdown
until pool.shutdown?
if errors.any?
pool.kill
fail errors.first
end
sleep 1
end
pool.wait_for_termination
end
I've created an issue #634. Concurrent thread pool can support abortable worker without any problems.
require "concurrent"
Concurrent::RubyThreadPoolExecutor.class_eval do
# Inspired by "ns_kill_execution".
def ns_abort_execution aborted_worker
@pool.each do |worker|
next if worker == aborted_worker
worker.kill
end
@pool = [aborted_worker]
@ready.clear
stopped_event.set
nil
end
def abort_worker worker
synchronize do
ns_abort_execution worker
end
nil
end
def join
shutdown
# We should wait for stopped event.
# We couldn't use timeout.
stopped_event.wait nil
@pool.each do |aborted_worker|
# Rubinius could receive an error from aborted thread's "join" only.
# MRI Ruby doesn't care about "join".
# It will receive error anyway.
# We can "raise" error in aborted thread and then "join" it from this thread.
# We can "join" aborted thread from this thread and then "raise" error in aborted thread.
# The order of "raise" and "join" is not important. We will receive target error anyway.
aborted_worker.join
end
@pool.clear
nil
end
class AbortableWorker < self.const_get :Worker
def initialize pool
super
@thread.abort_on_exception = true
end
def run_task pool, task, args
begin
task.call *args
rescue StandardError => error
pool.abort_worker self
raise error
end
pool.worker_task_completed
nil
end
def join
@thread.join
nil
end
end
self.send :remove_const, :Worker
self.const_set :Worker, AbortableWorker
end
class MyError < StandardError; end
pool = Concurrent::FixedThreadPool.new 5
begin
pool.post do
sleep 1
puts "we shouldn't receive this message"
end
pool.post do
puts "raising my error"
raise MyError
end
pool.join
rescue MyError => error
puts "received my error, trace: \n#{error.backtrace.join("\n")}"
end
sleep 2
Output:
raising my error
received my error, trace:
...
This patch works fine for any version of MRI Ruby and Rubinius. JRuby is not working and I don't care. Please patch JRuby executor if you want to support it. It should be easy.

Testing sidekiq worker and retryset Rails minitest

I have a Sidekiq middleware which catches a custom exception:
require 'celluloid'
require 'sidekiq/middleware/server/retry_jobs'
module Sidekiq
class RetryMiddleware < Sidekiq::Middleware::Server::RetryJobs
def call(worker, msg, queue)
yield
rescue Sidekiq::Shutdown
# ignore, will be pushed back onto queue during hard_shutdown
raise
rescue Sidekiq::Retries::Retry => e
# force a retry (for workers that have retries disabled)
msg['retry'] = e.max_retries
attempt_retry(worker, msg, queue, e.cause)
raise e.cause
rescue Sidekiq::Retries::Fail => e
# seriously, don't retry this
raise e.cause
rescue Exception => e
# ignore, will be pushed back onto queue during hard_shutdown
raise Sidekiq::Shutdown if exception_caused_by_shutdown?(e)
raise e unless msg['retry']
attempt_retry(worker, msg, queue, e)
raise e
end
end
end
and my worker looks like below
class SomeWorker
include Sidekiq::Worker
sidekiq_options retry: false
def perform(input_data)
begin
# logic to insert data into db
rescue Ione::Io::ConnectionClosedError => e
raise Sidekiq::Retries::Retry
end
end
end
I am trying to test that SomeWorker's perform method adds the job to the retry set.
In the test, however, the middleware does not seem to get called.
Thanks in advance.
You're making this way too hard on yourself, just call the methods.
RetryMiddleware.new.call(MyWorker.new, { ... }, 'default') do
MyWorker.new.perform(...)
end

Making errors abort EventMachine process

I'm working on creating a background script that uses EventMachine to connect to a server with WebSockets. The script will be run using DelayedJob or Resque. I've been able to get it to talk to the WebSockets server and send messages, but whenever an error is raised within the EventMachine loop it doesn't crash the script - which is what should happen (and what I need to have happen). I don't have to use EventMachine as I'm only sending WebSocket messages and not receiving them - but I'd love any help on this :) thank you!
#!/usr/bin/env ruby
require 'rubygems'
require 'eventmachine'
require 'em-http'
class Job
include EventMachine::Deferrable
def self.perform
job = Job.new
EventMachine.run {
http = EventMachine::HttpRequest.new("ws://localhost:8080/").get :timeout => 0
http.errback { puts "oops" }
http.callback {
puts "WebSocket connected!"
http.send("Hello watcher")
}
http.stream { |msg| }
job.callback { puts "done" }
Thread.new {
job.execute(http)
http.close
EventMachine.stop
}
}
end
def execute(h)
sleep 1
puts "Job Runner!"
h.send("welcome!")
sleep 2
asdsadsa # here I am trying to simulate an error
sleep 1
h.send("we are all done!")
sleep 1
set_deferred_status :succeeded
end
end
Job.perform
Since you're causing an exception inside a thread, you should set Thread.abort_on_exception to true; otherwise these errors will not be raised properly.
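A small sketch of what that setting does, using only the standard library. It assumes the per-thread form (Thread#abort_on_exception=) rather than the global Thread.abort_on_exception switch, so that nothing else in the process is affected. With the flag set, an unhandled exception in the thread is re-raised in the main thread. The main thread rescues it here only to show that it arrives.

```ruby
error = nil

begin
  Thread.new do
    Thread.current.abort_on_exception = true   # re-raise failures in the main thread
    Thread.current.report_on_exception = false # keep stderr quiet for the demo
    raise 'boom'
  end
  sleep 1 # the main thread is interrupted here when the worker fails
rescue RuntimeError => e
  error = e
end

puts "main thread received: #{error.message}" if error
```

Without the flag, the thread would die quietly (apart from an optional report on stderr) and the main thread would simply finish its sleep.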
You don't need to use Thread.new here at all. In fact, it's not thread-safe to do so (EventMachine itself is not thread-safe, except for EM::Queue, EM::Channel and EM.schedule).
If you want to do synchronous things in execute and you must have that thread, then you'll want to call h.send via EM.schedule, for example:
EM.schedule { h.send("welcome!") }
If you must have that thread, then you need to catch exceptions from the thread you spawn yourself. You can then stop and shut down on your own, or just re-raise the exception in the main (EventMachine) thread:
EM.run do
thread = Thread.new do
raise 'boom'
end
EM.add_periodic_timer(0.1) { thread.join(0) }
end
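The pattern relies on how Thread#join behaves with a zero timeout: it returns nil while the thread is still running, returns the thread once it has finished, and re-raises the thread's exception in the caller if the thread died with one. The sketch below shows this without EventMachine. The plain loop with a short sleep is a stand-in for the periodic timer.

```ruby
thread = Thread.new do
  Thread.current.report_on_exception = false # the error is surfaced via join below
  sleep 0.1
  raise 'boom'
end

error = nil
begin
  # Stand-in for EM.add_periodic_timer(0.1) { thread.join(0) }
  loop do
    break if thread.join(0) # nil while running; raises if the thread failed
    sleep 0.05
  end
rescue RuntimeError => e
  error = e
end

puts "re-raised in the polling thread: #{error.message}" if error
```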
The above pattern can easily just enumerate an array of threads in the periodic timer instead, if appropriate.
Finally, please note that exception bubbling (correct exception reporting) was only supported in EventMachine > 1.0, which is still in beta. To get usable backtraces when exceptions occur, either gem install eventmachine --pre or, better, use master from the GitHub repo.
