Ruby + AMQP: processing queue in parallel

Since most of my tasks depend on the network, I want to process my queue in parallel, not just one message at a time.
So, I'm using the following code:
#!/usr/bin/env ruby
# encoding: utf-8
require "rubygems"
require 'amqp'
EventMachine.run do
  connection = AMQP.connect(:host => '127.0.0.1')
  channel = AMQP::Channel.new(connection)
  channel.prefetch 5
  queue = channel.queue("pending_checks", :durable => true)
  exchange = channel.direct('', :durable => true)
  queue.subscribe(:ack => true) do |metadata, payload|
    time = rand(3..9)
    puts 'waiting ' + time.to_s + ' for message ' + payload
    sleep(time)
    puts 'done with ' + payload
    metadata.ack
  end
end
Why is it not using my prefetch setting? I guess it should get 5 messages and process them in parallel, no?

Prefetch is the maximum number of messages that may be sent to you in advance before you ack.
In other words, the prefetch size does not limit the transfer of single messages to a client, only the sending in advance of more messages while the client still has one or more unacknowledged messages. (From AMQP docs)
QoS Prefetching Messages
RabbitMQ AMQP Reference
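EventMachine is single-threaded and event-based, so the blocking sleep inside the subscribe block stalls the reactor and messages are handled one at a time, whatever the prefetch value. For parallel jobs on different threads or processes, see EM::Deferrable, then Thread or spawn.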
Also see Hot Bunnies, a fast DSL on top of the RabbitMQ Java client:
https://github.com/ruby-amqp/hot_bunnies
(Thanks for info from Michael Klishin on Google Groups, and stoyan on blogger)
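To illustrate the "different threads" suggestion, here is a minimal sketch using only the Ruby standard library. It does not touch AMQP at all: handle and process_in_parallel are hypothetical names, and the short sleep stands in for the network-bound work from the question. With the amqp gem, EventMachine.defer is one way to hand a blocking block to EventMachine's thread pool; the sketch below only shows the general idea of a fixed pool of workers.

```ruby
require "thread"

# Hypothetical stand-in for the blocking work done per message
# (the sleep in the question); not part of any AMQP library.
def handle(payload)
  sleep(0.05)
  "done with #{payload}"
end

# Run the given payloads on a fixed pool of worker threads and
# return the results in completion order (not input order).
def process_in_parallel(payloads, workers: 5)
  jobs = Queue.new
  payloads.each { |p| jobs << p }
  results = Queue.new

  pool = Array.new(workers) do
    Thread.new do
      loop do
        payload = begin
          jobs.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break
        end
        results << handle(payload)
      end
    end
  end
  pool.each(&:join)

  Array.new(results.size) { results.pop }
end

puts process_in_parallel(%w[a b c d e], workers: 5)
```

The pool size of 5 mirrors the prefetch value in the question, so at most five messages would be in progress at once.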

Related

How to work around zeromq late-joiner problem in a proxy (xpub/xsub)?

I have read all salient posts and the zeromq guide on the topic of the pub/sub late joiner, and hopefully I simply missed the answer, but just in case I haven't, I have two questions about the zeromq proxy in the context of the late joiner:
Does the zeromq proxy with the suggested XSUB/XPUB sockets also suffer from the late joiner problem, i.e. are the first few pub messages of a new publisher dropped?
If so, what is the accepted solution to ensure that even the first published message is received by subscribers with a matching topic? (My latest info is to sleep...)
I don't believe it is pertinent to the question, but just in case, here are the relevant parts of my code.
My proxy's run method, which runs in its own thread; it starts a capture thread if DEBUG is true to log all messages; if no matching subscription exists, nothing is captured.
The proxy works fine, including the capture socket. However, I am now adding functionality where a new publisher is started in a thread and will immediately start to publish messages. Hence my question: if a message is published straight away, will it be dropped?
def run(self):
    msg = debug_log(self, f"{self.name} thread running...")
    debug_log(self, msg + "binding sockets...")
    xpub = self.zmq_ctx.socket(zmq.XPUB)
    xpub.bind(sys_conf.system__zmq_broker_xpub_addr)
    xsub = self.zmq_ctx.socket(zmq.XSUB)
    xsub.bind(sys_conf.system__zmq_broker_xsub_addr)
    if self.loglevel == DEBUG and sys_conf.system__zmq_broker_capt_addr:
        debug_log(self, msg + " done, now starting broker with message "
                              "capture and logging.")
        capt = self.zmq_ctx.socket(zmq.PAIR)
        capt.bind(sys_conf.system__zmq_broker_capt_addr)
        capt_th = Thread(
            name=self.name + "_capture_thread", target=self.capture_run,
            args=(self.zmq_ctx,), daemon=True
        )
        capt_th.start()
        capt.recv()  # wait for peer thread to start and check in
        debug_log(self, msg + "capture peer synchronised, proceeding.")
        zmq.proxy(xpub, xsub, capt)  # blocks until thread terminates
    else:
        debug_log(self, msg + " starting broker.")
        zmq.proxy(xpub, xsub)  # blocks until thread terminates

def capture_run(self, ctx):
    """ capture socket's thread's run method. """
    debug_log(self, f"{self.name} capture thread running.")
    sock = ctx.socket(zmq.PAIR)
    sock.connect(sys_conf.system__zmq_broker_capt_addr)
    sock.send(b"")  # ack message to calling thread
    while True:
        try:  # assume we're getting topic string followed by python object
            topic = sock.recv_string()
            obj = sock.recv_pyobj()
        except Exception:  # if not simply log message in bytes
            topic = None
            obj = sock.recv()
        debug_log(self, f"{self.name} capture_run received topic {topic}, "
                        f"obj {obj}.")
My users of the proxy (they are all both subscribers and publishers) in different threads and/or processes from the proxy:
...
# establish zmq subscriber socket and connect to broker
self._evm_subsock = self._zmq_ctx.socket(zmq.SUB)
self.subscribe_topics()
self._evm_subsock.connect(sys_conf.system__zmq_broker_xpub_addr)
# establish pub socket and connect to broker
self._evm_pub_sock = self._zmq_ctx.socket(zmq.PUB)
self._evm_pub_sock.connect(sys_conf.system__zmq_broker_xsub_addr)
debug_log(self, msg + " Connected to pub/sub broker.")
...

How to push messages from unacked to ready

My question is similar to one asked previously, but that question did not get an answer. I have a consumer that processes each message by calling a web service. If the web service does not respond for some reason, I do not want the consumer to treat the RabbitMQ message as processed; I want to enqueue it again and process it later. My consumer is the following:
require File.expand_path('../config/environment.rb', __FILE__)
conn = Rabbit.connect
conn.start
ch = conn.create_channel
x = ch.exchange("d_notification_ex", :type => "x-delayed-message", :arguments => { "x-delayed-type" => "direct" })
q = ch.queue("d_notification_q", :durable => true)
q.bind(x)
p 'Wait ....'
q.subscribe(:manual_ack => true, :block => true) do |delivery_info, properties, body|
  datos = JSON.parse(body)
  if datos['status'] == 'request'
    # I call a web service and process the JSON
    result = Notification.send_payment_notification(datos.to_json)
  else
    # I call a web service and process the body
    result = Notification.send_payment_notification(body)
  end
  # If the web server is down, the web service call returns nil.
  # In that case the message is not acknowledged, so RabbitMQ leaves it in
  # UNACKED status and it is not processed later, whereas I want it to go
  # back to the queue and be evaluated again afterwards.
  unless result.nil?
    ch.ack(delivery_info.delivery_tag)
  end
end
[Image: RabbitMQ]
Is there some way that the statement ch.ack(delivery_info.delivery_tag), instead of deleting the element from the queue, lets me process it later? Any ideas? Thanks
The RabbitMQ team monitors this mailing list and only sometimes answers questions on StackOverflow.
Try this:
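if result.nil?
  # arguments: delivery_tag, multiple = false, requeue = true
  # (Bunny's nack does not requeue unless the third argument is true)
  ch.nack(delivery_info.delivery_tag, false, true)
else
  ch.ack(delivery_info.delivery_tag)
end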
I decided to send the data back to the queue in a "producer within the consumer" style. My code now looks like this:
if result.eql? 'ok'
  ch.ack(delivery_info.delivery_tag)
else
  if datos['count'] < 5
    # republish with a delay and an incremented retry counter
    datos['count'] += 1
    d_time = 1000
    x.publish(datos.to_json, :persistent => true, :headers => { "x-delay" => d_time })
  end
end
However, I was forced to include one more attribute in the JSON, count, so that the message does not stay in an infinite cycle.
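The counting logic above can be isolated into a small helper, which makes the retry limit easy to check on its own. This is only a sketch: next_attempt and MAX_ATTEMPTS are hypothetical names, and treating a missing count as 0 is an assumption (the code above would fail on a payload with no count attribute).

```ruby
require "json"

MAX_ATTEMPTS = 5 # assumed limit, matching the count check above

# Returns the JSON payload to republish, or nil when the message
# has used up its attempts and should not be published again.
def next_attempt(body, max_attempts: MAX_ATTEMPTS)
  datos = JSON.parse(body)
  count = datos.fetch("count", 0) # assumption: a missing count means 0
  return nil if count >= max_attempts

  datos["count"] = count + 1
  JSON.generate(datos)
end
```

In the consumer, a non-nil return value would be what gets passed to x.publish with the x-delay header, and a nil return value means the message has been retried enough times.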

Bunny::AccessRefused message when trying to read messages

I'm trying to read messages from a queue using Bunny. I only have read permissions on the RabbitMQ server but it seems the code I'm using tries to create the queue - though I can see the queue already exists with queue_exists?().
There must be a process in Bunny whereby one can simply read messages off an existing queue? Here's the code I'm using
require 'bunny'
class ExampleConsumer < Bunny::Consumer
  def cancelled?
    @cancelled
  end

  def handle_cancellation(_)
    @cancelled = true
  end
end
conn = Bunny.new("amqp://xxx:xxx@xxx", automatic_recovery: false)
conn.start
ch = conn.channel
q = ch.queue("a_queue")
consumer = ExampleConsumer.new(ch, q)
When I execute the above I receive:
/Users/jamessmith/.rvm/gems/ruby-1.9.3-p392/gems/bunny-1.7.1/lib/bunny/channel.rb:1915:in `raise_if_continuation_resulted_in_a_channel_error!': ACCESS_REFUSED - access to queue 'a_queue' in vhost '/' refused for user 'xxx' (Bunny::AccessRefused)
In most RMQ configurations I've seen, the consumer has permission to create the queues that it needs.
If your permissions must be set up so that the consumer can't create the queue, I'd suggest opening an issue ticket with the Bunny gem; it doesn't look like that is supported.

Daemon-kit process one amqp job at a time

We've used daemon-kit to create an AMQP worker which should receive a job and then ask for a new job, but not before the first job is finished. The problem is that Daemon Kit forks the job and immediately starts a new job if there is one in the RabbitMQ queue.
Is there a formal way to force one-job-at-a-time-behaviour in daemon-kit? Or how can we achieve this?
This is a short version of how we start the amqp worker and process jobs. When a job finishes with a result it publishes this back to the RabbitMQ server.
# Run an event-loop for processing
DaemonKit::AMQP.run do |connection|
  connection.on_tcp_connection_loss do |client, settings|
    DaemonKit.logger.debug("AMQP connection status changed: #{client.status}")
    client.reconnect(false, 1)
  end
  amq = AMQP::Channel.new
  amq.queue(engine_key).subscribe do |metadata, msg|
    msg_decode = JSON.parse(msg)
    job = REFxEngineRunnerAPI10.new msg_decode
    result = job.run(metadata.correlation_id)
    amq.queue(metadata.reply_to, :auto_delete => false)
    xc = amq.default_exchange
    xc.publish JSON.dump(result), :routing_key => metadata.reply_to, :correlation_id => metadata.correlation_id
  end
end
UPDATE
I found this to work for us:
DaemonKit::AMQP.run do |connection|
  amq = AMQP::Channel.new(connection, prefetch: 1)
  # I need this extra line because I use a RabbitMQ version newer than 2.3.6
  amq.qos(0, 1)
  # be sure to add (:ack => true)
  amq.queue(engine_key).subscribe(:ack => true) do |metadata, msg|
    #### run long job one at a time
    # Tell RabbitMQ I finished the job and I can now receive a new job
    metadata.ack
  end
end
I'm taking a stab in the dark here, since this sounds to me exactly how the protocol should behave. You can, however, use QoS or prefetching to limit the number of messages sent down to a subscriber from the broker using something like this:
amq = AMQP::Channel.new(connection, prefetch: 1)
According to the example, this should give you the behaviour you desire.
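To make the effect of a prefetch of 1 concrete, here is a toy model of the broker side in plain Ruby. It is not the amqp gem's API: PrefetchWindow, deliver and ack are hypothetical names, and the model only captures the rule that no new message is sent while the number of unacknowledged messages equals the prefetch limit.

```ruby
# Toy model of a broker honouring a prefetch limit; class and method
# names are hypothetical and unrelated to the amqp gem's API.
class PrefetchWindow
  attr_reader :in_flight

  def initialize(messages, prefetch:)
    @ready = messages.dup
    @prefetch = prefetch
    @in_flight = []
  end

  # Deliver messages until the unacked count reaches the prefetch limit.
  def deliver
    delivered = []
    while @in_flight.size < @prefetch && !@ready.empty?
      msg = @ready.shift
      @in_flight << msg
      delivered << msg
    end
    delivered
  end

  # Acknowledge one message, freeing a slot in the window.
  def ack(msg)
    @in_flight.delete(msg)
  end
end

window = PrefetchWindow.new(%w[job1 job2 job3], prefetch: 1)
first = window.deliver   # only job1 is sent
blocked = window.deliver # nothing is sent: job1 is still unacked
window.ack("job1")
second = window.deliver  # job2 is sent after the ack
```

This is why the subscriber has to use explicit acknowledgements: without an ack there is nothing for the prefetch window to count.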

How to synchronize data between multiple workers

I have the following problem, which is begging for a zmq solution. I have time-series data:
A,B,C,D,E,...
I need to perform an operation, Func, on each point.
It makes good sense to parallelize the task using multiple workers via zmq. However, what is tripping me up is how to synchronize the results, i.e., the results should be time-ordered exactly the way the input data came in. So the end result should look like:
Func(A), Func(B), Func(C), Func(D),...
I should also point out that the time to complete, say, Func(A) will be slightly different from that of Func(B). This may require me to block for a while.
Any suggestions would be greatly appreciated.
You will always need to block for a while in order to synchronize things. You can send requests to a pool of workers and, when a response is received, buffer it if it is not the next one in sequence. One simple workflow could be described in a pseudo-language as follows:
socket receiver;  # zmq.PULL
socket workers;   # zmq.DEALER, the worker thread socket is started as zmq.DEALER too.
poller = poller(receiver, workers);
next_id_req = incr()
out_queue = queue;
out_queue.last_id = next_id_req
buffer = sorted_queue;
sock = poller.poll()
if sock is receiver:
    packet_N = receiver.recv()
    # send N for processing
    workers.send(packet_N, ++next_id_req)
else if sock is workers:
    # get a processed response Func(N)
    func_N_response, id = workers.recv()
    if out_queue.last_id != id-1:
        # not subsequent id, buffer it
        buffer.push(id, func_N_response)
    else:
        # in order, push to out queue
        out_queue.push(id, func_N_response)
        # also consume all buffered subsequent items
        while (out_queue.last_id == buffer.min_id() - 1):
            id, buffered_N_resp = buffer.pop()
            out_queue.push(id, buffered_N_resp)
But here comes the problem: what happens if a packet is lost in the processing thread (the worker pool)? One option is to skip it after a certain timeout (flush the buffer into the out queue), continue filling the out queue, and reorder if the packet arrives later, if it ever does.
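The buffering part of the pseudo-code above can be sketched in plain Ruby, leaving the zmq sockets out entirely. ReorderBuffer is a hypothetical name, and the sketch assumes that ids are consecutive integers starting at first_id; it does not implement the timeout for lost packets.

```ruby
# Emits results in sequence order even when they arrive out of order.
# Ids are assumed to be consecutive integers starting at first_id.
class ReorderBuffer
  attr_reader :output

  def initialize(first_id: 1)
    @next_id = first_id
    @pending = {}
    @output = []
  end

  def push(id, result)
    @pending[id] = result
    # Drain every buffered result that is now next in line.
    while @pending.key?(@next_id)
      @output << @pending.delete(@next_id)
      @next_id += 1
    end
  end
end

buffer = ReorderBuffer.new
[[2, "Func(B)"], [1, "Func(A)"], [4, "Func(D)"], [3, "Func(C)"]].each do |id, result|
  buffer.push(id, result)
end
puts buffer.output.join(", ")
```

A hash keyed by id is used here instead of a sorted queue because only the next expected id is ever looked up, so no ordering of the buffered items is needed.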

Resources