Ruby mongo driver: Catch MongoDB connection errors after a few seconds? - ruby

I am performing this query with my Ruby mongo driver:
begin
  User.collection.find({}).count()
rescue => e
  Rails.logger.error e.to_s
end
I would like to catch all situations where this operation fails. The main reason it would fail is if the server was unavailable.
For example, one of the errors I see occasionally is:
Mongo::Error::NoServerAvailable (No server is available matching preference: #<Mongo::ServerSelector::Primary:0x70302744731080 tag_sets=[] max_staleness=nil> using server_selection_timeout=30 and local_threshold=0.015)
I want to catch errors after just 6 seconds.
From the docs I see that there are a few different timeout options (connect_timeout, server_selection_timeout, socket_timeout). But I am not sure which to pass, and how to pass them.

You're on the right track. server_selection_timeout is the correct option in this case -- it tells the driver how long to wait to find a suitable server before timing out. You can set that option on a new client in the Mongoid configuration file (config/mongoid.yml).
You want your configuration file to look something like this:
development: # (or production, or whatever environment you're using)
  clients:
    default:
      # your default client here
    new_client: # the name of your new client with different options
      database: mongoid
      hosts:
        - localhost:27017
      options:
        server_selection_timeout: 6
...
Read the Mongoid Configuration documentation to learn more about setting up config files.
Then, you want to use the new client you defined to perform your query, and rescue any Mongo::Error::NoServerAvailable errors.
begin
  User.with(client: 'new_client').collection.find({}).count()
rescue Mongo::Error::NoServerAvailable => e
  # perform error handling here
end
Note that this starts up a new Mongo::Client instance, which is an expensive operation if done repeatedly. I would recommend that you close the extra client once you are done using it, like so:
Mongoid::Clients.with_name('new_client').close
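If you're using the bare Ruby driver without Mongoid, the same option can be passed directly to Mongo::Client. A minimal sketch, assuming a local server; the database name is a placeholder:
require 'mongo'

# Standalone client with a 6-second server selection timeout.
# 'mydb' and the host are placeholders for your own settings.
client = Mongo::Client.new(['localhost:27017'],
                           database: 'mydb',
                           server_selection_timeout: 6)
begin
  client[:users].find({}).count
rescue Mongo::Error::NoServerAvailable => e
  Rails.logger.error e.to_s
ensure
  client.close
end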

Related

Aborting queries on neo4jrb

I am running something along the lines of the following:
results = queries.map do |query|
  begin
    Neo4j::Session.query(query)
  rescue Faraday::TimeoutError
    nil
  end
end
After a few iterations I get an unrescued Faraday::TimeoutError ("too many connection resets (due to Net::ReadTimeout - Net::ReadTimeout)"), and Neo4j needs switching off and on again.
I believe this is because the queries themselves aren't aborted - i.e. the connection times out but Neo4j carries on trying to run my query. I actually want to time them out, so simply increasing the timeout window won't help me.
I've had a scout around and it looks like I can find my queries and abort them via the Neo4j API, which will be my next move.
Am I right in my diagnosis? If so, is there a recommended way of managing queries (and aborting them) from neo4jrb?
Rebecca is right about managing queries manually. However, if you want Neo4j to automatically stop queries after a certain time, you can set this in your neo4j.conf:
dbms.transaction.timeout=60s
You can find more info in the docs for that setting.
The Ruby gem uses Faraday to connect to Neo4j via HTTP, and Faraday has a built-in timeout which is separate from the one in Neo4j. I would suggest setting the Neo4j timeout a bit longer (5-10 seconds, perhaps) than the one in Ruby (here are the docs for configuring the Faraday timeout). If they both have the same timeout, Neo4j might raise a timeout before Ruby, making for a less clear error.
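For the Faraday side, the timeouts can be set on the connection itself. Where exactly you hand this to neo4jrb depends on your gem version, so treat this as a sketch of plain Faraday usage rather than a documented neo4jrb option:
require 'faraday'

# Illustrative values: keep the Faraday (Ruby) timeout a few seconds shorter
# than dbms.transaction.timeout so the Ruby side raises the clearer error.
conn = Faraday.new(url: 'http://localhost:7474',
                   request: { timeout: 55, open_timeout: 5 })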
Query management can be done through Cypher. You must be an admin user.
To list all queries, you can use CALL dbms.listQueries;.
To kill a query, you can use CALL dbms.killQuery('ID-OF-QUERY-TO-KILL');, where the ID is obtained from the list of queries.
These statements must be executed as raw queries; it does not matter whether you are using an OGM, as long as you can submit Cypher manually. If your framework offers no way to do that, you will have to access the database by some other means in order to execute them.
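With neo4jrb, those statements can be sent as raw Cypher through the session; the query ID below is a made-up placeholder:
# List running queries, then kill one by its ID.
Neo4j::Session.query("CALL dbms.listQueries;").each do |row|
  puts "#{row.queryId}: #{row.query}"
end
Neo4j::Session.query("CALL dbms.killQuery('query-1234');")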
So thanks to Brian and Rebecca for useful tips about query management within Neo4j. Both of these point the way to viable solutions to my problem, and Brian's explicitly lays out steps for achieving one via Neo4jrb so I've marked it correct.
As both answers assume, the diagnosis I made IS correct - i.e. if you run a query from Neo4jrb and the HTTP connection times out, Neo4j will carry on executing the query and Neo4jrb will not issue any instruction for it to stop.
Neo4jrb does not provide a wrapper for any query management functionality, so simply setting a transaction timeout seems most sensible and probably what I'll adopt. Actually intercepting and killing queries is also possible, but this means running your query on one thread so that you can look up its queryId in another. This is the somewhat hacky solution I'm working with atm:
class QueryRunner
  DEFAULT_TIMEOUT = 70

  def self.query(query, timeout_limit = DEFAULT_TIMEOUT)
    new(query, timeout_limit).run
  end

  def initialize(query, timeout_limit)
    @query = query
    @timeout_limit = timeout_limit
  end

  def run
    start_time = Time.now.to_i
    # Run the query in a background thread so this thread can watch it.
    Thread.new { @result = Neo4j::Session.query(@query) }
    sleep 0.5
    return @result if @result
    # Look up the running query's ID so it can be killed on timeout.
    id = if query_ref = Neo4j::Session.query("CALL dbms.listQueries;").to_a.find { |x| x.query == @query }
      query_ref.queryId
    end
    while @result.nil?
      if (Time.now.to_i - start_time) > @timeout_limit
        puts "killing query #{id} due to timeout"
        Neo4j::Session.query("CALL dbms.killQuery('#{id}');")
        @result = []
      else
        sleep 1
      end
    end
    @result
  end
end
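Usage then looks like this (the Cypher string and the 30-second limit are just illustrative):
result = QueryRunner.query("MATCH (n:User) RETURN n LIMIT 10", 30)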

Sequel + ADO + Puma is not threading queries

We have a website running in Windows Server 2008 + SQLServer 2008 + Ruby + Sinatra + Sequel/Puma
We've developed an API for our website.
When the access points are requested by many clients at the same time, the clients start getting RequestTimeout exceptions.
I investigated a bit, and I noted that Puma is managing multithreading fine.
But Sequel (or some layer below Sequel) is processing one query at a time, even if the queries come from different clients.
In fact, the RequestTimeout exceptions don't occur if I launch several web servers, each one listening on a different port, and assign a different port to each client.
I don't know yet if the problem is Sequel, ADO, ODBC, Windows, SQLServer or something else.
The truth is that I cannot switch to any other technology (like TinyTDS).
Below is a little piece of code, with screenshots, that you can use to replicate the bug:
require 'sinatra'
require 'sequel'

CONNECTION_STRING =
  "Driver={SQL Server};Server=.\\SQLEXPRESS;" +
  "Trusted_Connection=no;" +
  "Database=pulqui;Uid=;Pwd=;"

DB = Sequel.ado(:conn_string => CONNECTION_STRING)

enable :sessions
configure { set :server, :puma }
set :public_folder, './public/'
set :bind, '0.0.0.0'

get '/delaybyquery.json' do
  tid = params[:tid].to_s
  begin
    puts "(track-id=#{tid}).starting access point"
    q = "select p1.* from liprofile p1, liprofile p2, liprofile p3, liprofile p4, liprofile p5"
    DB[q].each { |row| # this query should take a long time
      puts row[:id]
    }
    puts "(track-id=#{tid}).done!"
  rescue => e
    puts "(track-id=#{tid}).error:#{e.to_s}"
  end
end

get '/delaybycode.json' do
  tid = params[:tid].to_s
  begin
    puts "(track-id=#{tid}).starting access point"
    sleep(30)
    puts "(track-id=#{tid}).done!"
  rescue => e
    puts "(track-id=#{tid}).error:#{e.to_s}"
  end
end
There are 2 access points in the code above:
delaybyquery.json, which generates a delay by joining the same table 5
times. Note that the table needs about 1,000 rows for the query to run
really slowly; and
delaybycode.json, which generates a delay by just calling the Ruby sleep
function.
Both access points receive a tid (tracking-id) parameter, and both write their
output to the CMD window, so you can follow the activity of both processes in the same
window and check which access point is blocking incoming requests from other
browsers.
For testing I'm opening 2 tabs in the same Chrome browser.
Below are the 2 tests I'm performing.
Step #1: Run the webserver
c:\source\pulqui>ruby example.app.rb -p 81
I get the startup output (screenshot not included here).
Step #2: Testing Delay by Code
I called this URL:
127.0.0.1:81/delaybycode.json?tid=123
and 5 seconds later I called this other URL:
127.0.0.1:81/delaybycode.json?tid=456
Below is the output, where you can see that both calls work in parallel
(screenshot not included here).
Step #3: Testing Delay by Query
I called this URL:
127.0.0.1:81/delaybyquery.json?tid=123
and 5 seconds later I called this other URL:
127.0.0.1:81/delaybyquery.json?tid=456
Below is the output, where you can see that the calls are processed one at a time.
Each call to an access point finishes with a query timeout exception
(screenshot not included here).
This is almost assuredly due to win32ole (the driver that Sequel's ado adapter uses). It probably doesn't release the GVL during queries, which would cause the issues you are seeing.
If you cannot switch to TinyTDS or switch to JRuby, then your only option if you want concurrent queries is to run separate webserver processes, and have a reverse proxy server dispatch requests to them.
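As a rough illustration of the separate-processes approach (the script name echoes the example above, the ports are placeholders, and a reverse proxy such as nginx or HAProxy would sit in front of them):
# Hypothetical launcher: start one app process per port so queries from
# different clients hit different Ruby processes.
ports = [8081, 8082, 8083, 8084]
pids = ports.map { |port| spawn("ruby example.app.rb -p #{port}") }

# Keep the launcher alive until the children exit (Ctrl+C to stop everything).
pids.each { |pid| Process.wait(pid) }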

How to join multiple multicast groups on one interface

The Ruby version I have available is 1.8.7 and can't be upgraded, as it is part of the standard image used on all the company's Linux servers at this time, and anything I do needs to run on all of these servers without issue (I'm hoping this won't be a problem).
The project I am doing is to recreate on a Linux server an application that currently runs on Windows. The application takes a list of multicast groups and interfaces, attempts to join the groups, and then listens for any data (it doesn't matter what), reporting whether it could join and whether data arrived. It helps us prove out network connectivity in our environment prior to deploying the actual software onto the server. The data it will be receiving is binary-encoded financial information from an exchange, so I don't need to print it (hence the commented-out line and its replacement); I just need to check that it is reaching the server.
I have read up online and found bits and pieces of code that I have cobbled together into a small version of this: it joins 1 multicast group bound to 1 interface and listens for data for a period of time, reporting whether any data was received.
I then wanted to add a second multicast group, and this is where my understanding of how to achieve this is lacking. My code is as follows:
#!/usr/bin/ruby
require 'socket'
require 'ipaddr'
require 'timeout'

MCAST_GROUP_A = {
  :addr     => '233.54.12.111',
  :port     => 26477,
  :bindaddr => '172.31.230.156'
}

MCAST_GROUP_B = {
  :addr     => '233.54.12.111',
  :port     => 18170,
  :bindaddr => '172.31.230.156'
}

ipA = IPAddr.new(MCAST_GROUP_A[:addr]).hton + IPAddr.new(MCAST_GROUP_A[:bindaddr]).hton
ipB = IPAddr.new(MCAST_GROUP_B[:addr]).hton + IPAddr.new(MCAST_GROUP_B[:bindaddr]).hton

begin
  sockA = UDPSocket.open
  sockA.setsockopt Socket::IPPROTO_IP, Socket::IP_ADD_MEMBERSHIP, ipA
  sockA.setsockopt Socket::IPPROTO_IP, Socket::IP_ADD_MEMBERSHIP, ipB
  sockA.bind Socket::INADDR_ANY, MCAST_GROUP_A[:port]
  sockA.bind Socket::INADDR_ANY, MCAST_GROUP_B[:port]

  timeoutSeconds = 10
  Timeout.timeout(timeoutSeconds) do
    msg, info = sockA.recvfrom(1024)
    #puts "MSG: #{msg} from #{info[2]} (#{info[3]})/#{info[1]} len #{msg.size}"
    puts "MSG: <garbled> from #{info[2]} (#{info[3]})/#{info[1]} len #{msg.size}"
  end
rescue Timeout::Error
  puts "Nothing received connection timedout\n"
ensure
  sockA.close
end
The error I get when I run this is:
[root@dt1d-ddncche21a ~]# ./UDPServer.rb
./UDPServer.rb:35:in `setsockopt': Address already in use (Errno::EADDRINUSE)
    from ./UDPServer.rb:35
So that's where I am at. I could really do with, firstly, pointers as to what is wrong (hopefully with an update to the code); then, once I have this example working, the next step will be to add a second interface into the mix, again listening to multiple multicast groups.
OK, so I followed the advice given: bind to the interface first for each port, and then add memberships for each of the multicast groups I want to listen to. This has resolved this particular issue and moved me on to the next one, which I will raise as a new topic.
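For reference, a minimal, untested sketch of that approach using the same addresses as above: one socket per port, bound first, with the group membership added afterwards on that socket.
#!/usr/bin/ruby
require 'socket'
require 'ipaddr'

# Sketch only: one socket per port; bind before joining the group.
GROUPS = [
  { :addr => '233.54.12.111', :port => 26477, :bindaddr => '172.31.230.156' },
  { :addr => '233.54.12.111', :port => 18170, :bindaddr => '172.31.230.156' }
]

sockets = GROUPS.map do |group|
  ip_mreq = IPAddr.new(group[:addr]).hton + IPAddr.new(group[:bindaddr]).hton
  sock = UDPSocket.open
  sock.setsockopt Socket::SOL_SOCKET, Socket::SO_REUSEADDR, 1
  sock.bind Socket::INADDR_ANY, group[:port]
  sock.setsockopt Socket::IPPROTO_IP, Socket::IP_ADD_MEMBERSHIP, ip_mreq
  sock
end

# Wait up to 10 seconds for the first datagram on any of the sockets.
ready, = IO.select(sockets, nil, nil, 10)
if ready
  msg, info = ready.first.recvfrom(1024)
  puts "MSG: <garbled> from #{info[2]} (#{info[3]})/#{info[1]} len #{msg.size}"
else
  puts "Nothing received, connection timed out"
end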

Ruby - Elastic Search & RabbitMQ - data import being lost, script crashing silently

Stackers
I have a lot of messages in a RabbitMQ queue (running on localhost in my dev environment). The payload of the messages is a JSON string that I want to load directly into Elasticsearch (also running on localhost for now). I wrote a quick Ruby script to pull the messages from the queue and load them into ES, which is as follows:
#! /usr/bin/ruby
require 'bunny'
require 'json'
require 'elasticsearch'

# Connect to RabbitMQ to collect data
mq_conn = Bunny.new
mq_conn.start
mq_ch = mq_conn.create_channel
mq_q  = mq_ch.queue("test.data")

# Connect to ElasticSearch to post the data
es = Elasticsearch::Client.new log: true

# Main loop - collect the message and stuff it into the db.
mq_q.subscribe do |delivery_info, metadata, payload|
  begin
    es.index index: "indexname",
             type: "relationship",
             body: payload
  rescue
    puts "Received #{payload} - #{delivery_info} - #{metadata}"
    puts "Exception raised"
    exit
  end
end
mq_conn.close
There are around 4,000,000 messages in the queue.
When I run the script, I see a bunch of messages, say 30, being loaded into Elastic Search just fine. However, I see around 500 messages leaving the queue.
root@beep:~# rabbitmqctl list_queues
Listing queues ...
test.data 4333080
...done.
root@beep:~# rabbitmqctl list_queues
Listing queues ...
test.data 4332580
...done.
The script then silently exits without raising an exception. The begin/rescue block never triggers, so I don't know why the script is finishing early or losing so many messages. Any clues how I should debug this next?
I've added a simple, working example here:
https://github.com/elasticsearch/elasticsearch-ruby/blob/master/examples/rabbitmq/consumer-publisher.rb
It's hard to debug your example when you don't provide examples of the test data.
The Elasticsearch "river" feature is deprecated, and will be removed, eventually. You should definitely invest time into writing your own custom feeder, if RabbitMQ and Elasticsearch are a central part of your infrastructure.
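One common shape for such a feeder is a long-running, blocking consumer with manual acknowledgements. A rough sketch, not taken from the linked example, using standard Bunny and Elasticsearch client calls (the queue and index names are the ones from the question):
require 'bunny'
require 'elasticsearch'

mq_conn = Bunny.new
mq_conn.start
ch = mq_conn.create_channel
q  = ch.queue("test.data")

es = Elasticsearch::Client.new

# block: true keeps the process alive; manual_ack removes a message from the
# queue only after it has been indexed successfully.
q.subscribe(manual_ack: true, block: true) do |delivery_info, _metadata, payload|
  begin
    es.index index: "indexname", type: "relationship", body: payload
    ch.ack(delivery_info.delivery_tag)
  rescue => e
    puts "Failed to index message: #{e.message}"
    ch.nack(delivery_info.delivery_tag, false, true) # requeue for a retry
  end
end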
Answering my own question, I then learned that this is a crazy and stupid way to load a message queue of index instructions into Elastic. I created a river and can drain instructions much faster than I could with a ropey script. ;-)

Multiple roles with attributes(?) in Capistrano

How can I pass along attributes to my tasks in capistrano?
My goal is to deploy to multiple servers in a load balancer. I'd like to take each one out, deploy, and add it back in sequentially so no more than one server is down at any time.
I'm thinking it would be something along the lines of... (and the hosts array would be generated dynamically after querying my load balancer)...
role :app,
  [["server_one",   {:instanceId => "alice"}],
   ["server_two",   {:instanceId => "bob"}],
   ["server_three", {:instanceId => "charles"}]]
And then for my tasks...
before :deploy, :deregister_instance_from_lb
after  :deploy, :register_instance_with_lb

task :deregister_instance_from_lb do
  # TODO - deregister #{instanceId} from load balancer
end

task :register_instance_with_lb do
  # TODO - register #{instanceId} with load balancer
end
Any ideas?
I use this to restart my servers in series, instead of in parallel.
task :my_task, :roles => :web do
  find_servers_for_task(current_task).each do |server|
    run "[task command here]", :hosts => server.host
  end
end
Justin, I'm sorry, but that's not possible: once the stream pool is opened (on the first run against a server set) there's no way to access server properties, as the run code isn't executed per-server but against all matching servers in the pool. Some people have had some success with something like this, but really it's a symptom that your scripts need too much information that you should be able to extract from your production environment.
Since in this case it seems you are doing something like passing the host's name to a script, use what Unix gives you:
run "./my_script.rb `hostname`"
Will that work?
References:
• http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_03_04.html (Section 3.4.5)
• http://unixhelp.ed.ac.uk/CGI/man-cgi?hostname (or $ man (1) hostname)
No one knows? I found something about the sequential block below, but that's as far as I got...
find_servers.each do |server|
  # TODO - remove from load balancer
  # TODO - deploy
  # TODO - add back to load balancer
end
I find it hard to believe that no one has ever needed to do sequential tasks with cap.
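For what it's worth, a sequential pattern along the lines of that snippet might look like the following in Capistrano 2. The two load-balancer scripts are hypothetical placeholders for whatever your balancer exposes, and the run line is only a stand-in for the real deploy steps:
task :rolling_deploy, :roles => :app do
  find_servers_for_task(current_task).each do |server|
    # Hypothetical local scripts that talk to the load balancer.
    system "./deregister_from_lb.sh #{server.host}"

    # Stand-in for the real per-host deploy steps.
    run "cd #{current_path} && git pull && touch tmp/restart.txt", :hosts => server.host

    system "./register_with_lb.sh #{server.host}"
  end
end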
