How to join multiple multicast groups on one interface - ruby

The Ruby version I have available is 1.8.7, and it can't be upgraded because it is part of the standard image used on all the company's Linux servers at this time; anything I do needs to run on all of these servers without issue (though I'm hoping this won't be a problem).
The project I am doing is to recreate, on a Linux server, an application that currently runs on Windows. The application takes a list of multicast groups and interfaces, attempts to join the groups, and then listens for any data (the content doesn't matter), reporting whether it could join and whether data arrived. It helps us prove out network connectivity in our environment prior to deploying the actual software onto the server. The data it will receive is binary-encoded financial information from an exchange, so I don't need to output it (hence the commented-out line and the placeholder output); I just need to check that it is reaching the server.
I have read up online and found bits and pieces of code that I have cobbled together into a small version of this: it joins one multicast group bound to one interface and listens for data for a period of time, reporting whether any data was received.
I then wanted to add a second multicast group, and this is where my understanding of how to achieve it is lacking. My code is as follows:
#!/usr/bin/ruby
require 'socket'
require 'ipaddr'
require 'timeout'

MCAST_GROUP_A = {
  :addr     => '233.54.12.111',
  :port     => 26477,
  :bindaddr => '172.31.230.156'
}
MCAST_GROUP_B = {
  :addr     => '233.54.12.111',
  :port     => 18170,
  :bindaddr => '172.31.230.156'
}

ipA = IPAddr.new(MCAST_GROUP_A[:addr]).hton + IPAddr.new(MCAST_GROUP_A[:bindaddr]).hton
ipB = IPAddr.new(MCAST_GROUP_B[:addr]).hton + IPAddr.new(MCAST_GROUP_B[:bindaddr]).hton

begin
  sockA = UDPSocket.open
  sockA.setsockopt Socket::IPPROTO_IP, Socket::IP_ADD_MEMBERSHIP, ipA
  sockA.setsockopt Socket::IPPROTO_IP, Socket::IP_ADD_MEMBERSHIP, ipB
  sockA.bind Socket::INADDR_ANY, MCAST_GROUP_A[:port]
  sockA.bind Socket::INADDR_ANY, MCAST_GROUP_B[:port]

  timeoutSeconds = 10
  Timeout.timeout(timeoutSeconds) do
    msg, info = sockA.recvfrom(1024)
    #puts "MSG: #{msg} from #{info[2]} (#{info[3]})/#{info[1]} len #{msg.size}"
    puts "MSG: <garbled> from #{info[2]} (#{info[3]})/#{info[1]} len #{msg.size}"
  end
rescue Timeout::Error
  puts "Nothing received, connection timed out"
ensure
  sockA.close
end
The error I get when I run this is:
[root@dt1d-ddncche21a ~]# ./UDPServer.rb
./UDPServer.rb:35:in `setsockopt': Address already in use (Errno::EADDRINUSE)
        from ./UDPServer.rb:35
So that's where I am at. I could really do with pointers as to what is wrong (hopefully with an update to the code); once I have this example working, the next step will be to add a second interface into the mix and again listen to multiple multicast groups on it.

Ok, so I followed the advice given: bind to the interface first for each port, and then add memberships for each of the multicast groups I want to listen to. This resolved this particular issue and moved me on to the next one, which I will raise as a new topic. A sketch of the working approach is below.
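For anyone following along, here is a minimal sketch of what that fix looks like: one socket per port, bound before any memberships are added. The addresses and ports are the ones from the question; SO_REUSEADDR and the IO.select wait are my own additions, so treat this as an untested illustration rather than the final program.
require 'socket'
require 'ipaddr'

# One socket per port: bind first, then add one membership per group.
def open_multicast_socket(port, group_addrs, bindaddr)
  sock = UDPSocket.new
  sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_REUSEADDR, 1)
  sock.bind('0.0.0.0', port)
  group_addrs.each do |group|
    membership = IPAddr.new(group).hton + IPAddr.new(bindaddr).hton
    sock.setsockopt(Socket::IPPROTO_IP, Socket::IP_ADD_MEMBERSHIP, membership)
  end
  sock
end

sock_a = open_multicast_socket(26477, ['233.54.12.111'], '172.31.230.156')
sock_b = open_multicast_socket(18170, ['233.54.12.111'], '172.31.230.156')

# Wait up to 10 seconds for data to arrive on either socket.
if ready = IO.select([sock_a, sock_b], nil, nil, 10)
  msg, info = ready[0][0].recvfrom(1024)
  puts "MSG: <garbled> from #{info[2]} (#{info[3]})/#{info[1]} len #{msg.size}"
else
  puts "Nothing received, connection timed out"
end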

sidekiq - runaway FIFO pipes created with large job

We are using Sidekiq to process a number of backend jobs. One in particular is used very heavily. All I can really say about it is that it sends emails; it doesn't do the email creation (that's a separate job), it just sends them. We spin up a new worker for each email that needs to be sent.
We are trying to upgrade to Ruby 3, though, and are having problems. Ruby 2.6.8 has no issues; on 3 (as well as 2.7.3, IIRC), if there is a large number of queued workers, it will get through maybe 20K of them, then it will start hemorrhaging FIFO pipes, on the order of 300-1000 every 5 seconds or so. Eventually it hits the ulimit on the system (currently set at 64K) and all sockets/connections fail due to insufficient resources.
In trying to debug this issue I did a run with 90% of what the email worker does entirely commented out, so that it does basically nothing except make a couple of database queries and do some string templating. I thought I was getting somewhere with that approach, as one run (of 50K+ emails) succeeded without the pipe explosion. However, the next run (identical parameters) did wind up with the runaway pipes.
Profiling with rbspy and ruby-prof did not help much, as they primarily focus on the Sidekiq infrastructure, not the workers themselves.
Looking through our code, I did see that nothing we wrote is ever using IO.* (e.g. IO.popen, IO.select, etc), so I don't see what could be causing the FIFO pipes.
I did see https://github.com/mperham/sidekiq/wiki/Batches#huge-batches, which is not necessarily what we're doing. If you look at the code snippet below, we're basically creating one large batch. I'm not sure whether pushing jobs in bulk as per that link will help with the problem we're having, but I'm about to give it a try once I rework things a bit; a sketch of what I have in mind follows.
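For reference, a rough sketch of what that bulk push might look like (untested; TargetEmailWorker and the argument list come from the snippet further down, and targets, guid, and is_draft come from the surrounding class):
# Enqueue in slices instead of calling perform_async once per target.
# Sidekiq::Client.push_bulk is Sidekiq's documented bulk-enqueue API.
args = targets.map { |t| [t[:id], guid, is_draft ? t[:email_address] : nil] }
args.each_slice(1_000) do |slice|
  Sidekiq::Client.push_bulk('class' => 'TargetEmailWorker', 'args' => slice)
end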
No matter what I do I can't seem to figure out the following:
What is making these pipes? Why are they being created?
What is the condition under which the pipes start getting created at this runaway rate? There are two FIFO pipes that open when we start Sidekiq, but until enough work has been done, we generally don't see more than 2-6 pipes open.
Any advice is appreciated, even along the lines of where to look next, as I'm a bit stumped.
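As a rough, hypothetical way to watch the descriptor count from inside the process while it works (Linux-only; a background thread just polls ObjectSpace and /proc):
# Debugging sketch: periodically log how many descriptors this process holds.
Thread.new do
  loop do
    ruby_ios = ObjectSpace.each_object(IO).reject { |io| io.closed? }.size
    raw_fds  = Dir['/proc/self/fd/*'].size # every open fd, pipes included
    puts "IO objects: #{ruby_ios}, raw fds: #{raw_fds}"
    sleep 5
  end
end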
Initializer:
require_relative 'logger'
require_relative 'configuration'
require 'sidekiq-pro'
require "sidekiq-ent"

module Proprietary
  unless const_defined?(:ENVIRONMENT)
    ENVIRONMENT = ENV['RACK_ENV'] || ENV['RAILS_ENV'] || 'development'
  end

  # Sidekiq.client_middleware.add Sidekiq::Middleware::Client::Batch
  REDIS_URL = if ENV["REDIS_URL"].present?
                ENV["REDIS_URL"]
              else
                "redis://#{ENV["REDIS_SERVER"]}:#{ENV["REDIS_PORT"]}"
              end

  METRICS = Statsd.new "10.0.9.215", 8125

  Sidekiq::Enterprise.unique! unless Proprietary::ENVIRONMENT == "test"

  Sidekiq.configure_server do |config|
    # require 'sidekiq/pro/reliable_fetch'
    config.average_scheduled_poll_interval = 2
    config.redis = {
      namespace: Proprietary.config.SIDEKIQ_NAMESPACE,
      url: Proprietary::REDIS_URL
    }
    config.server_middleware do |chain|
      require 'sidekiq/middleware/server/statsd'
      chain.add Sidekiq::Middleware::Server::Statsd, :client => METRICS
    end
    config.error_handlers << Proc.new do |ex, ctx_hash|
      Proprietary.report_exception(ex, "Sidekiq", ctx_hash)
    end
    config.super_fetch!
    config.reliable_scheduler!
  end

  Sidekiq.configure_client do |config|
    config.redis = {
      namespace: Proprietary.config.SIDEKIQ_NAMESPACE,
      url: Proprietary::REDIS_URL,
      size: 15,
      network_timeout: 5
    }
  end
end
Code snippet (sanitized):
def add_targets_to_batch
  #target_count = targets.count
  queue_counter = 0
  batch.jobs do
    targets.shuffle.each do |target|
      send(target)
      queue_counter += 1
    end
  end
end

def send(campaign_target)
  TargetEmailWorker.perform_async(campaign_target[:id],
                                  guid,
                                  is_draft ? campaign_target[:email_address] : nil)
  begin
    Target.where(id: campaign_target[:id]).update(send_at: Time.now.utc)
  rescue Exception => ex
    Proprietary.report_exception(ex, self.class.name, { target_id: campaign_target[:id], guid: guid })
  end
end
First I tried auditing our external connections for connection pooling, etc. That did not help the issue. Eventually I got to the point where I disabled all external connections and let the job run doing virtually nothing outside of a database query and some logging. This allowed one run to complete without issue, but on the second one the FIFO pipes still grew out of control after a certain (variable) amount of work was done.

Sequel + ADO + Puma is not threading queries

We have a website running on Windows Server 2008 + SQL Server 2008 + Ruby + Sinatra + Sequel + Puma.
We've developed an API for our website.
When the access points are requested by many clients at the same time, the clients start getting RequestTimeout exceptions.
I investigated a bit and noted that Puma is managing multithreading fine.
But Sequel (or some layer below Sequel) is processing one query at a time, even when they come from different clients.
In fact, the RequestTimeout exceptions don't occur if I launch many web servers, each one listening on a different port, and assign a different port to each client.
I don't know yet if the problem is Sequel, ADO, ODBC, Windows, SQL Server or something else.
The truth is that I cannot switch to any other technology (like TinyTDS).
Below is a little piece of code, with screenshots, that you can use to reproduce the bug:
require 'sinatra'
require 'sequel'

CONNECTION_STRING =
  "Driver={SQL Server};Server=.\\SQLEXPRESS;" +
  "Trusted_Connection=no;" +
  "Database=pulqui;Uid=;Pwd=;"

DB = Sequel.ado(:conn_string => CONNECTION_STRING)

enable :sessions
configure { set :server, :puma }
set :public_folder, './public/'
set :bind, '0.0.0.0'

get '/delaybyquery.json' do
  tid = params[:tid].to_s
  begin
    puts "(track-id=#{tid}).starting access point"
    q = "select p1.* from liprofile p1, liprofile p2, liprofile p3, liprofile p4, liprofile p5"
    DB[q].each { |row| # this query should take a long time
      puts row[:id]
    }
    puts "(track-id=#{tid}).done!"
  rescue => e
    puts "(track-id=#{tid}).error:#{e.to_s}"
  end
end

get '/delaybycode.json' do
  tid = params[:tid].to_s
  begin
    puts "(track-id=#{tid}).starting access point"
    sleep(30)
    puts "(track-id=#{tid}).done!"
  rescue => e
    puts "(track-id=#{tid}).error:#{e.to_s}"
  end
end
There are 2 access points in the code above:
delaybyquery.json, which generates a delay by joining the same table 5 times. Note that the table needs about 1000 rows in order to make the query run really slowly; and
delaybycode.json, which generates a delay by just calling the Ruby sleep function.
Both access points receive a tid (tracking-id) parameter, and both write their output to the CMD window, so you can follow the activity of both processes in the same window and check which access point is blocking incoming requests from other browsers.
For testing I'm opening 2 tabs in the same Chrome browser.
Below are the two tests that I'm performing.
Step #1: Run the webserver
c:\source\pulqui>ruby example.app.rb -p 81
The server starts and prints its startup output (screenshot omitted).
Step #2: Testing Delay by Code
I called to this URL:
127.0.0.1:81/delaybycode.json?tid=123
and 5 seconds later I called this other URL
127.0.0.1:81/delaybycode.json?tid=456
Below is the output (screenshot omitted), where you can see that both calls work in parallel.
Step #3: Testing Delay by Query
I called to this URL:
127.0.0.1:81/delaybyquery.json?tid=123
and 5 seconds later I called this other URL
127.0.0.1:81/delaybyquery.json?tid=456
Below is the output (screenshot omitted), where you can see that the calls are processed one at a time.
Each call to an access point finishes with a query timeout exception.
This is almost assuredly due to win32ole (the driver that Sequel's ado adapter uses). It probably doesn't release the GVL during queries, which would cause the behavior you are seeing.
If you cannot switch to TinyTDS or to JRuby, then your only option for concurrent queries is to run separate web server processes, with a reverse proxy server in front dispatching requests to them; a rough sketch follows.
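For illustration only, a crude way to launch several independent copies of the example app, one per port (the filename and base port are from the question; you would still need a reverse proxy such as nginx in front to spread requests across the ports):
# Sketch: run one app instance per port; each process has its own GVL,
# so queries in different processes no longer serialize on win32ole.
ports = [81, 82, 83, 84]
pids = ports.map do |port|
  Process.spawn('ruby', 'example.app.rb', '-p', port.to_s)
end
Process.waitall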

How to read a constant stream of NMEA http data using ruby

I have the iPhone app called 'gps2ip', which launches a web server you can visit to get streaming NMEA data.
You can connect directly to this stream using QGIS to get an updated location position on your map. I'd like to access this stream programmatically.
If I type this into my browser's URL window: http://192.168.1.116:11123 (where 192.168.1.116 is the IP of my smartphone, as indicated by the gps2ip app),
I get a constant stream of newline-separated NMEA strings in my Safari/Chrome/Mozilla browser window, constantly updated at the bottom with new lines of data:
GPS 2 IP Server started. "exit" to finish.
$GPGGA,005730,3403.415,N,07914.488,W,1,8,0.9,13.6,M,46.9,M,0,2*56
$GPRMC,005730,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*66
$GPGGA,005730,3403.415,N,07914.488,W,1,8,0.9,13.6,M,46.9,M,0,2*56
$GPRMC,005730,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*66
$GPGGA,005731,3403.415,N,07914.488,W,1,8,0.9,13.7,M,46.9,M,0,2*56
$GPRMC,005731,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*67
$GPGGA,005731,3403.415,N,07914.488,W,1,8,0.9,13.7,M,46.9,M,0,2*56
$GPRMC,005731,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*67
$GPGGA,005732,3403.415,N,07914.488,W,1,8,0.9,13.6,M,46.9,M,0,2*54
$GPRMC,005732,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*64
$GPGGA,005732,3403.415,N,07914.488,W,1,8,0.9,13.6,M,46.9,M,0,2*54
$GPRMC,005732,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*64
$GPGGA,005733,3403.415,N,07914.488,W,1,8,0.9,13.5,M,46.9,M,0,2*56
$GPRMC,005733,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*65
$GPGGA,005733,3403.415,N,07914.488,W,1,8,0.9,13.5,M,46.9,M,0,2*56
$GPRMC,005733,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*65
$GPGGA,005734,3403.415,N,07914.488,W,1,8,0.9,13.4,M,46.9,M,0,2*50
$GPRMC,005734,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*62
$GPGGA,005734,3403.415,N,07914.488,W,1,8,0.9,13.4,M,46.9,M,0,2*50
$GPRMC,005734,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*62
$GPGGA,005735,3403.415,N,07914.488,W,1,8,0.9,13.3,M,46.9,M,0,2*56
$GPRMC,005735,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*63
$GPGGA,005735,3403.415,N,07914.488,W,1,8,0.9,13.3,M,46.9,M,0,2*56
$GPRMC,005735,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*63
$GPGGA,005736,3403.415,N,07914.488,W,1,8,0.9,13.2,M,46.9,M,0,2*54
$GPRMC,005736,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*60
$GPGGA,005736,3403.415,N,07914.488,W,1,8,0.9,13.2,M,46.9,M,0,2*54
$GPRMC,005736,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*60
$GPGGA,005737,3403.415,N,07914.488,W,1,8,0.9,13.4,M,46.9,M,0,2*53
$GPRMC,005737,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*61
$GPGGA,005737,3403.415,N,07914.488,W,1,8,0.9,13.4,M,46.9,M,0,2*53
$GPRMC,005737,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*61
$GPGGA,005738,3403.415,N,07914.488,W,1,8,0.9,13.4,M,46.9,M,0,2*5C
$GPRMC,005738,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*6E
$GPGGA,005738,3403.415,N,07914.488,W,1,8,0.9,13.4,M,46.9,M,0,2*5C
$GPRMC,005738,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*6E
$GPGGA,005739,3403.415,N,07914.488,W,1,8,0.9,13.4,M,46.9,M,0,2*5D
$GPRMC,005739,A,3403.415,N,07914.488,W,0.00,,120115,003.1,W*6F
$GPGGA,005739,3403.415,N,07914.488,W,1,8,0.9,13.4,M,46.9,M,0,2*5D
I know how to parse these lines of NMEA code into latitude/longitude pairs; I just need to be able to access "the last one" easily inside a Ruby environment.
I want to be able to pluck the last one or two lines and parse the NMEA strings manually, but I haven't figured out a way to "sip from the firehose" of data without generating an error message.
When I try this:
require 'open-uri'
open("http://192.168.1.116:11123")
I get this error:
Net::HTTPBadResponse: wrong status line: "GPS 2 IP Server started. \"exit\" to finish."
Where "GPS 2 IP Server Started. 'exit' to finish." is of course the first line of the response.
What ruby gem should I use to sip from this firehose of data? Apparently open-uri wants html headers and my stream has none of that. I just need to stream pure text apparently.
Since this isn't the HTTP protocol, you'll need the more generic Socket. You can do something like this:
require 'socket'

s = TCPSocket.new '192.168.1.116', 11123
while line = s.gets
  puts line
end
s.close
Depending on how fast the data arrives and how long it takes to process each line, you may need to investigate putting each line into a queue such as Sidekiq so that multiple workers can process lines simultaneously.
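If all you want is the most recent position rather than every line, a minimal hand-rolled variant of the loop above (same host and port, assuming it is acceptable to read for a few seconds and discard the rest) might look like:
require 'socket'

s = TCPSocket.new('192.168.1.116', 11123)
last_rmc = nil
deadline = Time.now + 5
# Read for a few seconds, remembering only the newest GPRMC sentence.
while Time.now < deadline && (line = s.gets)
  last_rmc = line.chomp if line.start_with?('$GPRMC')
end
s.close
puts "latest fix sentence: #{last_rmc}"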
Since what you have is a never-ending stream of data, you can't just grab the last message. You must consume the stream, and decide when you've received enough to start your processing. You could probably take advantage of the GPRMC message to do that.
In nmea_plus (full disclosure, I wrote it), you could do it this way:
require 'socket'
require 'nmea_plus'

allowable_delay = 3   # threshold to consider the stream as fresh
caught_up = false     # flag to say whether we've crossed the threshold

io_source = TCPSocket.new('192.168.1.116', 11123)
source_decoder = NMEAPlus::SourceDecoder.new(io_source)
source_decoder.each_message do |msg|
  case msg.data_type
  when 'GPRMC'
    if Time.now - msg.utc_time < allowable_delay
      caught_up = true
    end
  when 'GPGLL'
    if caught_up
      puts "Fix: #{msg.latitude}, #{msg.longitude}"
    end
  end
end
By default, the decoder ignores any lines that don't parse as NMEA.

Where/how can I host my small, database-driven, Sinatra-powered Ruby app? Heroku limits database connections

Here is my app: a price inflation calculator. If you click 'Calculate', you'll probably get a server-related error. When I tested the app out in Sinatra, I got an error saying:
PG::Error: FATAL: too many connections for role "********"
After looking into it, it turns out that Heroku limits its free databases to 20 connections.
I'd like to continue developing small apps (especially database-driven ones) on Heroku, but I may not be able to if I'm unable to get around this restriction. I would pay $50/month to get a database that allows more connections, but I do not yet know if it would be worth it.
My question is: Does anyone know if it's possible to get around this restriction for free, or if there are alternatives to Heroku that can host a database-driven Sinatra app?
Here's the code I use that adds inflation data to the database:
require 'rubygems'
require 'rest-client'
require 'nokogiri'
require 'sequel'

### MAKE CPI DATABASE ###
db_name = 'DATABASE_NAME_HERE'
DB = Sequel.postgres(db_name, :user => 'USER_NAME', :password => 'PASSWORD', :host => 'HOST', :port => 5432, :sslmode => 'require')

DB.create_table! :cpi_nsa_annual do
  primary_key :id
  Integer :year
  Float :cpi
end # DONE: DB.create_table :cpi_nsa_annual

cpi_annual = DB[:cpi_nsa_annual]
### DONE MAKING CPI DATABASE ###

post_url = "http://data.bls.gov/pdq/SurveyOutputServlet"
post_params = {
  'delimiter'      => 'comma',
  'output_format'  => 'html',
  'output_type'    => 'column',
  'periods_option' => 'all_periods',
  'series_id'      => 'CUUR0000SA0',
  'years_option'   => 'all_years'
}

if page = RestClient.post(post_url, post_params)
  npage = Nokogiri::HTML(page)
  data = npage.css('table.regular-data tbody tr')
  data.each { |row|
    month_prefix = (row.css('th')[2].text)[0]
    year  = row.css('th')[1].text
    month = (row.css('th')[2].text)[1..2]
    cpi   = row.css('td').text
    if month_prefix == 'M' and month == '13'
      cpi_annual.insert(
        :year => year,
        :cpi  => cpi
      ) # DONE: cpi_annual.insert
      p ["YEAR", year, cpi]
    end # DONE: if month_prefix=='M' and month=='13'
  }
end # DONE: if page

p cpi_annual.each { |row| p row }
It really doesn't seem like you need a database to accomplish what you want in that app. Can't you just store the annual rates in a hash right in your Ruby code?
$inflation_rates = {"1999" => 2.19, "2000" => 2.97, "2001" => 3.73}
Or maybe I'm misunderstanding how your app works.
I've hosted small Sinatra apps with databases on Heroku without problems; the 20-connection limit has been enough. Are you sure it's not a fault of your app? Maybe it's using an excessive number of connections? Could you post your code?
Also, OpenKeyVal is an awesome free key/value data store with no connection limit (the only limit is that each key must be under 64KB). It might force you to change your code around, but it's at least worth looking into. Because you can store data with JSONP calls, you can build an app that uses the datastore from a single static HTML file.
meub's answer suggesting you store the data outside of a database is sound, but if you feel like you really want to stick with a database, try looking into why your app is using so many connections.
If you run select * from pg_stat_database WHERE datname = 'yourdbname', the numbackends field will tell you how many connections are open to your database. If you do a fresh deploy to Heroku and then visit your app a few times, does the number of connections increase? Perhaps you need to close your connections.
If you add DB.disconnect at the end of your code (Sequel's disconnect lives on the Database object), does the number of connections stop going up with each page load?
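For what it's worth, the same check can be run from the app itself with the question's Sequel setup (a sketch; the database name is a placeholder):
# Count live connections to this database, then release this process's pool.
row = DB["SELECT numbackends FROM pg_stat_database WHERE datname = ?", 'yourdbname'].first
puts "connections: #{row[:numbackends]}"
DB.disconnect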

Ruby IMAP "changes" since last check

I'm working on an IMAP client using Ruby and Rails. I can successfully import messages, mailboxes, and more... However, after the initial import, how can I detect any changes that have occurred since my last sync?
Currently I am storing the UIDs and UID validity values in the database, comparing them, and searching appropriately. This works, but it doesn't detect deleted messages or changes to message flags, etc.
Do I have to pull all messages every time to detect these changes? How do other IMAP clients do it so quickly (e.g. Apple Mail and Postbox)? My script is already taking 10+ seconds per account with very few emails:
# select ourself as the current mailbox
@imap_connection.examine(self.location)

# grab all new messages and update them in the database;
# if the UIDs are still valid, we will just fetch the newest UIDs,
# otherwise we need to search from when we last synced, which is slower :(
if self.uid_validity.nil? || uid_validity == self.uid_validity
  # for some IMAP servers, if a mailbox is empty, a uid_fetch will fail, so then
  begin
    messages = @imap_connection.uid_fetch(uid_range, ['UID', 'RFC822', 'FLAGS'])
  rescue
    # gmail cries if the folder is empty
    uids = @imap_connection.uid_search(['ALL'])
    messages = @imap_connection.uid_fetch(uids, ['UID', 'RFC822', 'FLAGS']) unless uids.empty?
  end
  messages.each do |imap_message|
    Message.create_from_imap!(imap_message, self.id)
  end unless messages.nil?
else
  query = self.last_synced.nil? ? ['ALL'] : ['SINCE', Net::IMAP.format_datetime(self.last_synced)]
  @imap_connection.search(query).each do |message_id|
    imap_message = @imap_connection.fetch(message_id, ['RFC822', 'FLAGS', 'UID'])[0]
    # don't mark the messages as read
    #@imap_connection.store(message_id, '-FLAGS', [:Seen])
    Message.create_from_imap!(imap_message, self.id)
  end
end

# now assume all UIDs are valid
self.uid_validity = uid_validity
# now remember that we just fetched all those messages
self.last_synced = Time.now
self.save!
There is an IMAP extension for Quick Flag Changes Resynchronization (RFC 4551). With this extension it is possible to ask for all messages whose flags have changed since the last synchronization (tracked with a per-message modification sequence rather than a timestamp). However, as far as I know this extension is not widely supported.
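For illustration, the flag-resynchronization query with that extension looks roughly like this in raw IMAP (12345 stands for the highest mod-sequence value seen at the last sync):
tag0 UID FETCH 1:* (FLAGS) (CHANGEDSINCE 12345)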
There is an informational RFC that describes how IMAP clients should do synchronization (RFC-4549, section 4.3). The text recommends issuing the following two commands:
tag1 UID FETCH <lastseenuid+1>:* <descriptors>
tag2 UID FETCH 1:<lastseenuid> FLAGS
The first command is used to fetch the required information for all unknown mails (without knowing how many mails there are). The second command is used to synchronize the flags for the already seen mails.
AFAIK this method is widely used. Therefore, many IMAP servers contain optimizations in order to provide this information quickly. Typically, the network bandwidth is the limiting factor.
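Translated into Ruby's Net::IMAP, a minimal sketch of that two-command strategy might look like this (host, credentials, and last_seen_uid are placeholders; in Net::IMAP a Range ending in -1 stands for *):
require 'net/imap'

imap = Net::IMAP.new('imap.example.com', 993, true)
imap.login('user', 'password')
imap.examine('INBOX')

# tag1: fetch everything for messages we have never seen before
new_messages = imap.uid_fetch((last_seen_uid + 1)..-1, ['UID', 'RFC822', 'FLAGS']) || []
# tag2: fetch only the flags of already-seen messages to detect changes
flag_updates = imap.uid_fetch(1..last_seen_uid, ['FLAGS', 'UID']) || []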
The IMAP protocol is brain-dead this way, unfortunately. IDLE really should be able to return this kind of information while connected, for example. The FETCH FLAGS suggestion above is the only way to do it.
One thing to be careful of, however, is that UIDs are only guaranteed to be stable for a given UIDVALIDITY value; if the server announces a new UIDVALIDITY for a mailbox, any UIDs you have stored for it must be discarded.
