ruby-kafka: read all messages in a topic and exit - ruby

I need to read all messages from a Kafka topic, process them, and exit (no need to run forever like a daemon). I have written the code below. It serves the purpose if messages are available in the topic, but if the topic is empty (or there are no new messages for the given group_id) it waits until the next message arrives. I need to exit immediately if there are no messages to process. Please have a look at my code and suggest a better way to achieve this.
I am using the ruby-kafka 1.3.0 gem:
require 'kafka'
khost = 'xxx.xxx.xxx.xxx'
kport = 'xxxx'
kafka = Kafka.new(["#{khost}:#{kport}"] )
consumer = kafka.consumer(group_id: "my-consumer")
consumer.subscribe("my-topic")
consumer.each_batch do |batch|
  $msg = batch
  consumer.stop # stop after reading the first batch
end
# Process messages here
$msg.messages.each do |message|
  puts message.value
end
I have also found the kafka.fetch_messages method; however, I did not find a way to maintain a group_id and track already-processed messages with it without adding extra code.

Related

Ruby Bunny - Consuming from Multiple Queues

I’ve just started using Ruby and am writing a piece to consume some messages from a RabbitMQ queue. I’m using Bunny to do so.
So I’ve created my queues and bound them to an exchange.
However, I’m now unsure how to subscribe to both of them and keep the Ruby app running (I want the messages to keep coming through, i.e. not blocked, or at least not for a long time) until I actually exit it with Ctrl+C.
I’ve tried using :block => true, but as I have 2 different queues to subscribe to, this means it keeps consuming from only one.
So this is how I’m consuming messages:
def consumer
  begin
    puts ' [*] Waiting for messages. To exit press CTRL+C'
    @oneQueue.subscribe(:manual_ack => true) do |delivery_info, properties, payload|
      puts('Got One Queue')
      puts "Received #{payload}, message properties are #{properties.inspect}"
    end
    @twoQueue.subscribe(:manual_ack => true) do |delivery_info, properties, payload|
      puts('Got Two Queue')
      puts "Received #{payload}, message properties are #{properties.inspect}"
    end
  rescue Interrupt => _
    #TODO - close connections here
    exit(0)
  end
end
Any help would be appreciated.
Thanks!
You can't use block: true when you have two subscriptions as only the first one will block; it'll never get to the second subscription.
One thing you can do is set up both subscriptions without blocking (which will automatically spawn two threads to process messages), and then block your main thread with a wait loop (add just before your rescue):
loop { sleep 5 }
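The shape of that suggestion can be sketched with the standard library alone. This is only an illustration, not Bunny code: the two Queue objects stand in for the broker queues, start_consumer is a hypothetical helper playing the role of a non-blocking subscribe, and a nil message is used as a stop signal so the sketch terminates instead of waiting for Ctrl+C.

```ruby
# Stand-ins for the two broker queues (assumption: not Bunny objects).
one_queue = Queue.new
two_queue = Queue.new
received  = Queue.new # thread-safe sink for whatever the consumers see

# Hypothetical helper: one background thread per subscription,
# similar in spirit to a non-blocking subscribe.
def start_consumer(name, source, sink)
  Thread.new do
    # Queue#pop blocks until a message arrives; nil means "stop" in this sketch.
    while (payload = source.pop)
      sink << "#{name}: #{payload}"
    end
  end
end

workers = [
  start_consumer("one", one_queue, received),
  start_consumer("two", two_queue, received)
]

one_queue << "hello"
two_queue << "world"

# A long-running app would block here with `loop { sleep 5 }`.
# The sketch sends stop signals and joins the workers instead.
[one_queue, two_queue].each { |q| q << nil }
workers.each(&:join)

messages = []
messages << received.pop until received.empty?
puts messages.sort
```

Both consumers make progress because neither subscription blocks the thread that sets up the other; only the main thread waits.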

AMQP gem specifying a dead letter exchange

I've specified a queue on the RabbitMQ server called MyQueue. It is durable and has x-dead-letter-exchange set to MyQueue.DLX.
(I also have an exchange called MyExchange bound to that queue, and another exchange called MyQueue.DLX, but I don't believe this is important to the question)
If I use ruby's amqp gem to subscribe to those messages I would do it like this:
# Doing this before and in a new thread has to do with how my code is structured
# shown here in case it has a bearing on the question
Thread.new do
  AMQP.start('amqp://guest:guest@127.0.0.1:5672')
end
EventMachine.next_tick do
  channel = AMQP::Channel.new(AMQP.connection)
  queue = channel.queue("MyQueue", :durable => true, :'x-dead-letter-exchange' => "MyQueue.DLX")
  queue.subscribe(:ack => true) do |metadata, payload|
    p metadata
    p payload
  end
end
If I execute this code with the queues and exchanges already created and bound (as they need to be in my set up) then RabbitMQ throws the following error in its logs:
=ERROR REPORT==== 19-Aug-2013::14:25:53 ===
connection <0.19654.2>, channel 2 - soft error:
{amqp_error,precondition_failed,
"inequivalent arg 'x-dead-letter-exchange'for queue 'MyQueue' in vhost '/': received none but current is the value 'MyQueue.DLX' of type 'longstr'",
'queue.declare'}
Which seems to be saying that I haven't specified the same Dead Letter Exchange as the pre-existing queue - but I believe I have with the queue = ... line.
Any ideas?
The DLX info is passed in the arguments option:
queue = channel.queue("MyQueue", {durable: true, arguments: {"x-dead-letter-exchange" => "MyQueue.DLX"}})
I had the same error, even when using @Karl Wilbur's format for the options.
Looks like your "MyQueue" already exists on the RabbitMQ server (durable: true) and it exists without a dead letter exchange configuration.
queue = channel.queue("MyQueue", :durable => true, :'x-dead-letter-exchange' => "MyQueue.DLX")
This will not create a new queue if one already exists with the name "MyQueue". Instead it will try to connect to the existing one, but the options and arguments have to be the same, or you get an error like the one you got.
All you have to do is delete the old one and run your code again (with Karl's suggestion).
I used the RabbitMQ management GUI to delete mine. see here re deleting queues
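The equivalence check described in these answers can be modelled with a toy class. This is a sketch under assumptions, not the amqp gem or RabbitMQ itself: ToyBroker, its declare method, and PreconditionFailed are hypothetical names that only mimic the "redeclare must match" rule.

```ruby
# Toy model (assumption: hypothetical class, not a real client or broker).
class ToyBroker
  PreconditionFailed = Class.new(StandardError)

  def initialize
    @queues = {}
  end

  # Declaring creates the queue if it is missing; otherwise the requested
  # settings must equal the stored ones or the declaration is rejected.
  def declare(name, durable: false, arguments: {})
    requested = { durable: durable, arguments: arguments }
    existing = (@queues[name] ||= requested)
    return name if existing == requested

    raise PreconditionFailed, "inequivalent declaration for queue '#{name}'"
  end
end

broker = ToyBroker.new
dlx = { "x-dead-letter-exchange" => "MyQueue.DLX" }

# The queue already exists with a dead letter exchange configured.
broker.declare("MyQueue", durable: true, arguments: dlx)

# Redeclaring with the same arguments is accepted.
broker.declare("MyQueue", durable: true, arguments: dlx)

# Redeclaring without the arguments ("received none") is rejected.
begin
  broker.declare("MyQueue", durable: true)
rescue ToyBroker::PreconditionFailed => e
  puts "rejected: #{e.message}"
end
```

The same rule explains both answers: the DLX setting has to travel in the arguments hash to be seen at all, and it has to match whatever the existing queue was created with.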

Programmatic access to the Resque failed-job queue

How can I write code to go through the Resque failure queue and selectively delete jobs? Right now I've got a handful of important failures there, interspersed between thousands of failures from a runaway job that ran repeatedly. I want to delete the ones generated by the runaway job. The only API I'm familiar with is for enqueuing jobs. (I'll continue RTFMing, but I'm in a bit of a hurry.)
I ended up doing it like this:
# loop over all failure indices, instantiating as needed
(Resque::Failure.count-1).downto(0).each do |error_index_number|
  failure = Resque::Failure.all(error_index_number)
  # here failure is the hash that has all the data about the failed job; perform any check you need here
  # (note: present? comes from ActiveSupport)
  if failure["error"][/regex_identifying_runaway_job/].present?
    Resque::Failure.remove(error_index_number)
    # or
    # Resque::Failure.requeue(error_index_number)
  end
end
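The loop above walks the indices from highest to lowest, presumably because removing an entry shifts every later entry down by one. A minimal sketch of that effect, using a plain Array as a stand-in for the failure list (the helper names are hypothetical, not part of Resque):

```ruby
# Remove matching items while walking from the last index to the first.
def remove_matching_descending(list)
  (list.size - 1).downto(0) do |i|
    list.delete_at(i) if yield(list[i])
  end
  list
end

# Same idea walking upward: after a removal the next item slides into
# the current index and is never examined.
def remove_matching_ascending(list)
  0.upto(list.size - 1) do |i|
    list.delete_at(i) if i < list.size && yield(list[i])
  end
  list
end

failures = %w[runaway runaway important runaway]

descending = remove_matching_descending(failures.dup) { |f| f == "runaway" }
ascending  = remove_matching_ascending(failures.dup)  { |f| f == "runaway" }

p descending # => ["important"]
p ascending  # => ["runaway", "important"]
```

Iterating downward means a removal only affects indices that have already been visited.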
As @Winfield mentioned, having a look at Resque's failure backend is useful.
You can manually modify the failure queue the way you're asking, but it might be better to write a custom failure handler that deletes or re-enqueues jobs as they fail.
You can find the base failure backend here and an implementation that logs failed jobs to the Hoptoad exception tracking service here.
For example:
module Resque
  module Failure
    class RemoveRunaways < Base
      def save
        i = 0
        while job = Resque::Failure.all(i)
          # Selectively remove all MyRunawayJobs from failure queue whenever they fail
          if job.fetch('payload').fetch('class') == 'MyRunawayJob'
            # remove is a class-level method, so call it on Resque::Failure
            Resque::Failure.remove(i)
          else
            i = i + 1
          end
        end
      end
    end
  end
end
EDIT: Forgot to mention how to specify this backend to handle Failures.
In your Resque initializer (eg: config/initializers/resque.rb):
# Use Resque Multi failure handler: standard handler and your custom handler
Resque::Failure::Multiple.classes = [Resque::Failure::Redis, Resque::Failure::RemoveRunaways]
Resque::Failure.backend = Resque::Failure::Multiple
Remove with bool function example
I used a higher-order function approach: a predicate is passed in and decides whether each failure should be removed.
def remove_failures(should_remove_failure_func)
  (Resque::Failure.count-1).downto(0).each do |i|
    failure = Resque::Failure.all(i)
    Resque::Failure.remove(i) if should_remove_failure_func.call(failure)
  end
end
def remove_failed_validation_jobs
  has_failed_for_validation_reason = -> (failure) do
    failure["error"] == "Validation failed: Example has already been taken"
  end
  remove_failures(has_failed_for_validation_reason)
end

Script file to retrieve MQ messages from the queue manager

I want to write a script that appends the MQ messages arriving at the queue manager to a log file. Please help.
If you want all messages arriving on a channel, you can use the LogIP exit from the BlockIP2 page of mrmq.dk. An API exit such as SupportPac MA0W can log all messages put. An API exit can catch messages from local applications as well as those arriving over channels.
If you want to script this, you can use a program such as Q (from SupportPac MA01) to remove the messages from the queue as they arrive and append them to a file.
For example,
#!/usr/bin/ksh
q -IMYQMGR/MY.QUEUE >> logfile.txt
Typically, the script is triggered and configured to append new messages to the file. The problem with this is that it destructively removes the messages. If there is an application of record needing to use those messages it isn't a great solution. You could browse the queue but there's no guarantee of getting the messages before the app of record gets them - and the browse would periodically restart at the head of the queue so you might log the same message twice.
Another scripting option is the Perl MQSeries module. This module exposes all the options of the WMQ API as well as object-oriented methods. If you need something quick and dirty, the Q program is delivered as an executable. If you want something powerful that exposes all the APIs to your script (and don't mind compiling it) the Perl MQSeries module is a great way to go. Here's a code snippet, taken from the module's samples, showing how to GET messages:
while (1) {
    $sync_flag = 0;
    undef $outcome;
    my $request_msg = MQSeries::Message::->new();
    my $status = $request_queue->
      Get('Message' => $request_msg,
          'GetMsgOpts' =>
          {
           'WaitInterval' => 5000,   # 5 seconds
           'Options' => (MQSeries::MQGMO_WAIT |
                         MQSeries::MQGMO_SYNCPOINT_IF_PERSISTENT |
                         MQSeries::MQGMO_CONVERT |
                         MQSeries::MQGMO_FAIL_IF_QUIESCING),
          },
         );
    unless ($status) {               # Error
        my $rc = $request_queue->Reason();
        die "Error on 'Get' from queue $qmgr_name/$request_qname:\n" .
          "\tReason: $rc (" . MQReasonToText($rc). ")\n";
    }
    next if ($status < 0);           # No message available
One thing people have done in the past is to convert the queue to an alias over a topic. The app that uses the messages is redirected to GET from a new queue and an administrative subscription connects the topic to the new queue. At this point the real app gets all the messages and a new subscription can be made for logging messages going through the topic.

Posting large number of messages to AMQP queue

Using v0.7.1 of the Ruby amqp library and Ruby 1.8.7, I am trying to post a large number (millions) of short (~40 bytes) messages to a RabbitMQ server. My program's main loop (well, not really a loop, but still) looks like this:
AMQP.start(:host => '1.2.3.4',
           :username => 'foo',
           :password => 'bar') do |connection|
  channel = AMQP::Channel.new(connection)
  exchange = channel.topic("foobar", {:durable => true})
  i = 0
  EM.add_periodic_timer(1) do
    print "\rPublished #{i} commits"
  end
  results = get_results # <- Returns an array
  processor = proc do
    if x = results.shift then
      exchange.publish(x, :persistent => true,
                       :routing_key => "test.#{i}")
      i += 1
      EM.next_tick processor
    end
  end
  EM.next_tick(processor)
  AMQP.stop {EM.stop}
end
The code starts processing the results array just fine, but after a while (usually, after 12k messages or so) it dies with the following error
/Library/Ruby/Gems/1.8/gems/amqp-0.7.1/lib/amqp/channel.rb:807:in `send':
The channel 1 was closed, you can't use it anymore! (AMQP::ChannelClosedError)
No messages are stored on the queue. The error seems to be happening just when network activity from the program to the queue server starts.
What am I doing wrong?
First mistake is that you didn't post the RabbitMQ version that you are using. Lots of people are running old obsolete version 1.7.2 because that is what is in their OS package repositories. Bad move for anyone sending the volume of messages that you are. Get RabbitMQ 2.5.1 from the RabbitMQ site itself and get rid of your default system package.
Second mistake is that you did not tell us what is in the RabbitMQ logs.
Third mistake is that you said nothing about what is consuming the messages. Is there another process running somewhere that has declared a queue and bound it to the exchange? There is NO message queue unless somebody declares it to RabbitMQ and binds it to an exchange. Even then, messages will only flow if the binding key for the queue matches the routing key that you publish with.
Fourth mistake. You have routing keys and binding keys mixed up. The routing key is a string such as topic.test.json.echos and the binding key (used to bind a queue to an exchange) is a pattern like topic.# or topic.*.json.
Updated after your clarifications
Regarding versions, I'm not sure when it was fixed but there was a problem in 1.7.2 with large numbers of persistent messages causing RabbitMQ to crash when it rolled over its persistence log, and after crashing it was unable to restart until someone manually undid the rollover.
When you say that a connection is being opened and closed, I hope that it is not per message. That would be a strange way to use AMQP.
Let me repeat. Producers do NOT write messages to queues. They write messages to exchanges which then route the messages to queues based on the routing key (string) and the queue's binding key (pattern). In your example I misread the use of the # sign, but I see nothing which declares a queue and binds it to the exchange.
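To make the routing key versus binding key distinction concrete, here is a minimal sketch of topic-exchange matching in plain Ruby. It assumes the usual AMQP topic rules ("*" matches exactly one dot-separated word, "#" matches zero or more words); topic_match? and match_words are hypothetical helper names, not part of the amqp gem or RabbitMQ.

```ruby
# Returns true when routing_key satisfies the binding_key pattern.
def topic_match?(binding_key, routing_key)
  match_words(binding_key.split("."), routing_key.split("."))
end

def match_words(pattern, words)
  return words.empty? if pattern.empty?

  head, *rest = pattern
  case head
  when "#"
    # "#" may consume zero or more words, so try every possible split.
    (0..words.length).any? { |n| match_words(rest, words.drop(n)) }
  when "*"
    # "*" must consume exactly one word.
    !words.empty? && match_words(rest, words.drop(1))
  else
    # A literal word must match exactly.
    words.first == head && match_words(rest, words.drop(1))
  end
end

p topic_match?("topic.#", "topic.test.json.echos")      # => true
p topic_match?("topic.*.json", "topic.test.json")       # => true
p topic_match?("topic.*.json", "topic.test.json.echos") # => false
p topic_match?("topic.#", "test.42")                    # => false
```

With routing keys of the form test.N, as in the question, a queue would need a binding such as test.* or test.# on the foobar exchange before any message could reach it.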

Resources