Ruby - Elastic Search & RabbitMQ - data import being lost, script crashing silently - ruby

Stackers
I have a lot of messages in a RabbitMQ queue (running on localhost in my dev environment). The payload of each message is a JSON string that I want to load directly into Elasticsearch (also running on localhost for now). I wrote a quick Ruby script to pull the messages from the queue and load them into ES, which is as follows:
#! /usr/bin/ruby
require 'bunny'
require 'json'
require 'elasticsearch'
# Connect to RabbitMQ to collect data
mq_conn = Bunny.new
mq_conn.start
mq_ch = mq_conn.create_channel
mq_q = mq_ch.queue("test.data")
# Connect to ElasticSearch to post the data
es = Elasticsearch::Client.new log: true
# Main loop - collect the message and stuff it into the db.
mq_q.subscribe do |delivery_info, metadata, payload|
  begin
    es.index index: "indexname",
             type: "relationship",
             body: payload
  rescue
    puts "Received #{payload} - #{delivery_info} - #{metadata}"
    puts "Exception raised"
    exit
  end
end
mq_conn.close
There are around 4,000,000 messages in the queue.
When I run the script, I see a bunch of messages, say 30, being loaded into Elasticsearch just fine. However, I see around 500 messages leaving the queue.
root@beep:~# rabbitmqctl list_queues
Listing queues ...
test.data 4333080
...done.
root@beep:~# rabbitmqctl list_queues
Listing queues ...
test.data 4332580
...done.
The script then exits silently without reporting an exception. The begin/rescue block never fires, so I don't know why the script is finishing early or losing so many messages. Any clues on how I should debug this next?
A

I've added a simple, working example here:
https://github.com/elasticsearch/elasticsearch-ruby/blob/master/examples/rabbitmq/consumer-publisher.rb
It's hard to debug your example when you don't provide examples of the test data.
The Elasticsearch "river" feature is deprecated, and will be removed, eventually. You should definitely invest time into writing your own custom feeder, if RabbitMQ and Elasticsearch are a central part of your infrastructure.

Answering my own question, I then learned that this is a crazy and stupid way to load a message queue of index instructions into Elastic. I created a river and can drain instructions much faster than I could with a ropey script. ;-)
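For anyone who still wants to go the script route: a likely explanation for the symptoms in the question is that Bunny's subscribe does not block by default, so the script falls straight through to mq_conn.close. With automatic acknowledgements, every message already pushed to the consumer counts as delivered even if it was never indexed. Below is a minimal sketch of a consumer that blocks and acknowledges manually. It assumes a reasonably recent Bunny (the manual_ack and block options) and the same queue and index names as the question. The helper names index_and_ack and run_consumer and the prefetch value of 100 are my own choices, not part of the original script.

```ruby
# Index one message and acknowledge it only after Elasticsearch accepted it.
# `es` must respond to #index; `channel` must respond to #ack and #nack.
def index_and_ack(es, channel, delivery_info, payload)
  es.index(index: 'indexname', type: 'relationship', body: payload)
  channel.ack(delivery_info.delivery_tag)
  true
rescue StandardError => e
  warn "Indexing failed: #{e.class}: #{e.message}"
  # Requeue the message (multiple = false, requeue = true) so it is not lost.
  channel.nack(delivery_info.delivery_tag, false, true)
  false
end

# Wiring against a local RabbitMQ and Elasticsearch (assumed defaults).
def run_consumer
  require 'bunny'
  require 'elasticsearch'

  conn = Bunny.new
  conn.start
  channel = conn.create_channel
  channel.prefetch(100) # cap the number of unacknowledged deliveries (assumed value)
  queue = channel.queue('test.data')
  es = Elasticsearch::Client.new(log: true)

  # block: true keeps the main thread here instead of running on to conn.close.
  queue.subscribe(manual_ack: true, block: true) do |delivery_info, _properties, payload|
    index_and_ack(es, channel, delivery_info, payload)
  end
ensure
  conn.close if conn
end
```

Calling run_consumer from the script starts draining the queue. Note that a message which always fails to index would be requeued indefinitely in this sketch, so a dead-letter queue or a retry limit would be the next thing to consider.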

Related

sidekiq - runaway FIFO pipes created with large job

We are using Sidekiq to process a number of backend jobs. One in particular is used very heavily. All I can really say about it is that it sends emails. It doesn't do the email creation (that's a separate job), it just sends them. We spin up a new worker for each email that needs to be sent.
We are trying to upgrade to Ruby 3 and having problems, though. Ruby 2.6.8 has no issues; in 3 (as well as 2.7.3 IIRC), if there is a large number of queued workers, it will get through maybe 20K of them, then it will start hemorrhaging FIFO pipes, on the order of 300-1000 every 5 seconds or so. Eventually it hits the ulimit on the system (currently set at 64K) and all sockets/connections fail due to insufficient resources.
In trying to debug this issue I did a run with 90% of what the email worker does entirely commented out, so it does basically nothing except make a couple database queries and do some string templating. I thought I was getting somewhere with that approach, as one run (of 50K+ emails) succeeded without the pipe explosion. However, the next run (identical parameters) did wind up with the runaway pipes.
Profiling with rbspy and ruby-prof did not help much, as they primarily focus on the Sidekiq infrastructure, not the workers themselves.
Looking through our code, I did see that nothing we wrote is ever using IO.* (e.g. IO.popen, IO.select, etc), so I don't see what could be causing the FIFO pipes.
I did see https://github.com/mperham/sidekiq/wiki/Batches#huge-batches, which is not necessarily what we're doing. If you look at the code snippet below, we're basically creating one large batch. I'm not sure whether pushing jobs in bulk as per the link will help with the problem we're having, but I'm about to give it a try once I rework things a bit.
No matter what I do I can't seem to figure out the following:
What is making these pipes? Why are they being created?
What is the condition by which the pipes start getting made exponentially? There are two FIFO pipes that open when we start Sidekiq, but until enough work has been done, we don't see more than 2-6 pipes open generally.
Any advice is appreciated, even along the lines of where to look next, as I'm a bit stumped.
Initializer:
require_relative 'logger'
require_relative 'configuration'
require 'sidekiq-pro'
require "sidekiq-ent"
module Proprietary
  unless const_defined?(:ENVIRONMENT)
    ENVIRONMENT = ENV['RACK_ENV'] || ENV['RAILS_ENV'] || 'development'
  end
  # Sidekiq.client_middleware.add Sidekiq::Middleware::Client::Batch
  REDIS_URL = if ENV["REDIS_URL"].present?
    ENV["REDIS_URL"]
  else
    "redis://#{ENV["REDIS_SERVER"]}:#{ENV["REDIS_PORT"]}"
  end
  METRICS = Statsd.new "10.0.9.215", 8125
  Sidekiq::Enterprise.unique! unless Proprietary::ENVIRONMENT == "test"
  Sidekiq.configure_server do |config|
    # require 'sidekiq/pro/reliable_fetch'
    config.average_scheduled_poll_interval = 2
    config.redis = {
      namespace: Proprietary.config.SIDEKIQ_NAMESPACE,
      url: Proprietary::REDIS_URL
    }
    config.server_middleware do |chain|
      require 'sidekiq/middleware/server/statsd'
      chain.add Sidekiq::Middleware::Server::Statsd, :client => METRICS
    end
    config.error_handlers << Proc.new do |ex,ctx_hash|
      Proprietary.report_exception(ex, "Sidekiq", ctx_hash)
    end
    config.super_fetch!
    config.reliable_scheduler!
  end
  Sidekiq.configure_client do |config|
    config.redis = {
      namespace: Proprietary.config.SIDEKIQ_NAMESPACE,
      url: Proprietary::REDIS_URL,
      size: 15,
      network_timeout: 5
    }
  end
end
Code snippet (sanitized)
def add_targets_to_batch
  @target_count = targets.count
  queue_counter = 0
  batch.jobs do
    targets.shuffle.each do |target|
      send(target)
      queue_counter += 1
    end
  end
end
def send(target)
  TargetEmailWorker.perform_async(target[:id],
                                  guid,
                                  is_draft ? target[:email_address] : nil)
  begin
    Target.where(id: target[:id]).update(send_at: Time.now.utc)
  rescue Exception => ex
    Proprietary.report_exception(ex, self.class.name, { target_id: target[:id], guid: guid })
  end
end
end # closes the enclosing class, whose opening line is not part of the snippet
First I tried auditing our external connections for connection pooling, etc. That did not help the issue. Eventually I got to the point where I disabled all external connections and let the job run doing virtually nothing outside of a database query and some logging. This allowed one run to complete without issue, but on the second one, the FIFO pipes still grew exponentially after a certain (variable) amount of work was done.
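In case it helps anyone trying the bulk-push idea from the wiki link mentioned in the question, here is a rough sketch of what that could look like for this batch. It reuses the worker arguments from the snippet above and is meant to be wired to Sidekiq::Client.push_bulk. The helper names (email_job_args, push_in_slices), the slice size of 1,000, and the injected pusher callable are illustrative assumptions, and the per-target send_at update from the original is left out. Whether this has any effect on the FIFO pipe growth is not established; it only reduces the number of Redis round trips while enqueuing.

```ruby
# Build one argument array per target, mirroring the perform_async call
# in the snippet (id, guid, and the email address only for drafts).
def email_job_args(targets, guid, is_draft)
  targets.map do |target|
    [target[:id], guid, is_draft ? target[:email_address] : nil]
  end
end

# Push jobs in slices. `pusher` is any callable that accepts the payload hash;
# in the application it would wrap Sidekiq::Client.push_bulk.
# Returns the number of jobs handed to the pusher.
def push_in_slices(worker_class, job_args, slice_size: 1_000, pusher:)
  pushed = 0
  job_args.each_slice(slice_size) do |slice|
    pusher.call('class' => worker_class, 'args' => slice)
    pushed += slice.size
  end
  pushed
end

# Intended wiring inside the batch (commented out because it needs Sidekiq loaded):
#   batch.jobs do
#     push_in_slices(TargetEmailWorker,
#                    email_job_args(targets.shuffle, guid, is_draft),
#                    pusher: ->(payload) { Sidekiq::Client.push_bulk(payload) })
#   end
```

With 50K targets and the assumed slice size, this would issue 50 bulk pushes instead of 50K individual perform_async calls.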

Ruby mongo driver: Catch MongoDB connection errors after a few seconds?

I am performing this query with my Ruby mongo driver:
begin
  User.collection.find({}).count()
rescue => e
  Rails.logger.error e.to_s
end
I would like to catch all situations where this operation fails. The main reason it would fail is if the server was unavailable.
For example, one of the errors I see occasionally is:
Mongo::Error::NoServerAvailable (No server is available matching preference: #<Mongo::ServerSelector::Primary:0x70302744731080 tag_sets=[] max_staleness=nil> using server_selection_timeout=30 and local_threshold=0.015)
I want to catch errors after just 6 seconds.
From the docs I see that there are a few different timeout options (connect_timeout, server_selection_timeout, socket_timeout). But I am not sure which to pass, and how to pass them.
You're on the right track. server_selection_timeout is the correct option in this case -- it tells the driver how long to wait to find a suitable server before timing out. You can set that option on a new client in the Mongoid configuration file (config/mongoid.yml).
You want your configuration file to look something like this:
development: # (or production or whatever environment you're using)
  clients:
    default:
      # your default client here
    new_client: # the name of your new client with different options
      database: mongoid
      hosts:
        - localhost:27017
      options:
        server_selection_timeout: 6
        ...
Read the Mongoid Configuration documentation to learn more about setting up config files.
Then, you want to use the new client you defined to perform your query, and rescue any Mongo::Error::NoServerAvailable errors.
begin
  User.with(client: 'new_client').collection.find({}).count()
rescue Mongo::Error::NoServerAvailable => e
  # perform error handling here
end
Note that this starts up a new Mongo::Client instance, which is an expensive operation if done repeatedly. I would recommend that you close the extra client once you are done using it, like so:
Mongoid::Clients.with_name('new_client').close

Activity cannot send a response with data larger than 32768 characters

I am trying to invoke a simple Lambda function (it prints hello world to the console) using Ruby. However, when I run the code and look at the SWF dashboard, I see the following error:
Reason: An Activity cannot send a response with data larger than 32768 characters. Please limit the size of the response. You can look at the Activity Worker logs to see the original response.
Could someone help me resolve this issue?
The code is as follows:
require 'aws/decider'
require 'aws-sdk'

class U_Act
  extend AWS::Flow::Activities

  activity :b_u do
    {
      version: "1.0"
    }
  end

  def b_u(c_id)
    lambda_client = Aws::Lambda::Client.new(
      region: "xxxxxx",
      access_key_id: "XxXXXXXXXXX",
      secret_access_key: "XXXXXXXXXX"
    )
    resp = lambda_client.invoke(
      function_name: "s_u_1" # required
    )
    print "#{resp}"
  end
end
Thanks
According to the AWS documentation, input and result data are limited to 32,768 characters. This limit affects activity or workflow execution result data, input data when scheduling activity tasks or workflow executions, and input sent with a workflow execution signal.
Workarounds for this issue:
Upload the message to AWS S3 and pass the S3 path between the activities.
If you need high performance, store the values in ElastiCache and pass the keys between the activities.
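The S3 workaround could be sketched roughly as follows: return the result as-is when it fits, otherwise store it in S3 and return a small JSON pointer for the next activity to resolve. The function name result_or_s3_pointer, the key prefix swf-results/, and the pointer field names are hypothetical; the s3_client argument is expected to behave like Aws::S3::Client (only put_object is used). The limit is a parameter because the framework serializes return values before sending them, so leaving some headroom below 32,768 is probably wise.

```ruby
require 'json'
require 'securerandom'

# Limit taken from the SWF error message in the question.
SWF_RESULT_LIMIT = 32_768

# Return `result` unchanged when it fits within `limit` characters.
# Otherwise upload it to S3 and return a JSON pointer to the stored object.
def result_or_s3_pointer(result, s3_client, bucket, limit = SWF_RESULT_LIMIT)
  return result if result.length <= limit

  key = "swf-results/#{SecureRandom.uuid}" # hypothetical key layout
  s3_client.put_object(bucket: bucket, key: key, body: result)
  JSON.generate('s3_bucket' => bucket, 's3_key' => key)
end

# Possible use at the end of an activity (commented out, needs AWS credentials):
#   s3 = Aws::S3::Client.new(region: 'xxxxxx')
#   result_or_s3_pointer(resp.payload.read, s3, 'my-results-bucket')
```

The receiving activity would parse the JSON, detect the pointer fields, and fetch the object from S3 before continuing.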

How to join multiple multicast groups on one interface

The Ruby version I have available is 1.8.7. It can't be upgraded, as it is part of the standard image used on all of the company's Linux servers at this time, and anything I do needs to run on all of these servers without issue (I'm hoping this won't be a problem).
The project I am doing is to recreate, on a Linux server, an application that currently runs on Windows. The application takes a list of multicast groups and interfaces, attempts to join the groups, and then listens for any data (it doesn't matter what), reporting whether it could join and whether data was there. It helps us prove out network connectivity in our environment prior to deploying the actual software on to the server. The data it will be receiving is binary-encoded financial information from an exchange, so I don't need to output it (hence the commented-out line and the placeholder output); I just need to check that it is available to the server.
I have read up online and found bits and pieces of code that I have cobbled together into a small version of this where it joins 1 multicast group bound to 1 interface and listens for data for a period of time reporting whether any data was received.
I then wanted to add a second multicast group and this is where my understanding is lacking in how to achieve this. My code is as follows:
#!/usr/bin/ruby
require 'socket'
require 'ipaddr'
require 'timeout'
MCAST_GROUP_A =
{
  :addr => '233.54.12.111',
  :port => 26477,
  :bindaddr => '172.31.230.156'
}
MCAST_GROUP_B =
{
  :addr => '233.54.12.111',
  :port => 18170,
  :bindaddr => '172.31.230.156'
}
ipA = IPAddr.new(MCAST_GROUP_A[:addr]).hton + IPAddr.new(MCAST_GROUP_A[:bindaddr]).hton
ipB = IPAddr.new(MCAST_GROUP_B[:addr]).hton + IPAddr.new(MCAST_GROUP_B[:bindaddr]).hton
begin
  sockA = UDPSocket.open
  sockA.setsockopt Socket::IPPROTO_IP, Socket::IP_ADD_MEMBERSHIP, ipA
  sockA.setsockopt Socket::IPPROTO_IP, Socket::IP_ADD_MEMBERSHIP, ipB
  sockA.bind Socket::INADDR_ANY, MCAST_GROUP_A[:port]
  sockA.bind Socket::INADDR_ANY, MCAST_GROUP_B[:port]
  timeoutSeconds = 10
  Timeout.timeout(timeoutSeconds) do
    msg, info = sockA.recvfrom(1024)
    #puts "MSG: #{msg} from #{info[2]} (#{info[3]})/#{info[1]} len #{msg.size}"
    puts "MSG: <garbled> from #{info[2]} (#{info[3]})/#{info[1]} len #{msg.size}"
  end
rescue Timeout::Error
  puts "Nothing received connection timedout\n"
ensure
  sockA.close
end
The error I get when I run this is:
[root@dt1d-ddncche21a ~]# ./UDPServer.rb
./UDPServer.rb:35:in `setsockopt': Address already in use (Errno::EADDRINUSE)
from ./UDPServer.rb:35
So that's where I am at. I could really do with pointers as to what is wrong (hopefully with an update to the code). Once I have this example working, the next step will be to add a second interface into the mix, again listening to multiple multicast groups.
Ok so I followed the advice given to bind to the interface first for each port and then add members for each of the multicast groups I want to listen to and this has resolved this particular issue and moved me on to the next issue I have. The next issue I will raise as a new topic.
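For reference, the bind-first approach described above might look something like the sketch below, with one socket per port. Both groups in the question use the same multicast address on the same interface, and joining the same group twice on a single socket is the likely source of the EADDRINUSE error, so a separate socket per port sidesteps that. The helper names (membership_request, open_group_socket, names_with_data) are my own, and the SO_REUSEADDR option is an assumption added so that several listeners can share a port. The syntax is kept compatible with Ruby 1.8.7.

```ruby
require 'socket'
require 'ipaddr'

# Build the 8-byte membership request: group address followed by interface address.
def membership_request(group_addr, interface_addr)
  IPAddr.new(group_addr).hton + IPAddr.new(interface_addr).hton
end

# Open one socket for one group/port: bind first, then join the group.
def open_group_socket(group)
  sock = UDPSocket.new
  sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEADDR, 1)
  sock.bind('0.0.0.0', group[:port])
  sock.setsockopt(Socket::IPPROTO_IP, Socket::IP_ADD_MEMBERSHIP,
                  membership_request(group[:addr], group[:bindaddr]))
  sock
end

# Given a hash of name => socket, wait up to timeout_seconds and return the
# names of the sockets that received at least one datagram.
def names_with_data(named_sockets, timeout_seconds)
  pending = named_sockets.dup
  received = []
  deadline = Time.now + timeout_seconds
  until pending.empty?
    remaining = deadline - Time.now
    break if remaining <= 0
    ready = IO.select(pending.values, nil, nil, remaining)
    break if ready.nil?
    ready[0].each do |sock|
      sock.recvfrom(1024) # the payload itself is irrelevant here
      pending.each { |name, s| received << name if s.equal?(sock) }
      pending.delete_if { |_name, s| s.equal?(sock) }
    end
  end
  received
end

# Example wiring (commented out because it needs the real network):
#   sockets = {
#     'A' => open_group_socket(MCAST_GROUP_A),
#     'B' => open_group_socket(MCAST_GROUP_B)
#   }
#   puts names_with_data(sockets, 10).inspect
```

Using IO.select across all sockets means every group is checked within a single timeout window rather than one after another, which should also extend naturally to a second interface by adding more entries to the hash.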

Dynamically loading new jobs in a SidekiqStatus container to monitor completion

I built a small web crawler implemented as two Sidekiq workers: Crawler and Parsing. The Crawler worker looks for links while the Parsing worker reads the page body.
I want to trigger an alert when the crawling/parsing of all pages is complete. Monitoring only the Crawler job is not the best solution since it may have finished but there might be several Parser jobs running.
Having looked at the sidekiq-status gem, it seems that I cannot dynamically add new jobs to the container for monitoring. E.g. it would be nice to have an "add" method in the following context:
@container = SidekiqStatus::Container.new
# ... for each page url found:
jid = ParserWorker.perform_async(page_url)
@container.add(jid)
The closest to this is to use "SidekiqStatus::Container.load" or "SidekiqStatus::Container.load_multi" however, it is not possible to add new jobs in the container a posteriori.
One solution would be to create as many SidekiqStatus::Container instances as the number of ParserJobs and check if all of them have status == "finished", but I wonder if a more elegant solution exists using these tools.
Any help is appreciated.
You are describing Sidekiq Pro's Batches feature exactly. You can spend a lot of time or some money to solve your problem.
https://github.com/mperham/sidekiq/wiki/Batches
OK, here's a simple solution. Using the sidekiq-status gem, the Crawler worker keeps track of the job IDs for the Parser jobs and waits while any Parser job is still busy (using the SidekiqStatus::Container instances to check job status).
def perform()
  # for each page....
  @jids << ParserWorker.perform_async(page_url)
  # end
  # crawler finished, parsers may still be running
  while parsers_busy?
    sleep 5 # wait 5 secs between each check
  end
  # all parsers complete, trigger notification...
end
def parsers_busy?
  status_containers = SidekiqStatus::Container.load_multi(@jids)
  status_containers.any? do |container|
    container.status == 'waiting' || container.status == 'working'
  end
end
