In-memory stream for Ruby - ruby

When working with Protocol Buffers, the real message size is known only once the whole object has been written to an IO. So I use the following approach: write the object to an intermediate stream, get its size, and then write the whole data, with a header containing the size as an int, to a TCP socket.
What I do not like in the following code is the message_size function, which uses a real disk file instead of a memory stream.
require 'protocol_buffers'
module MyServer
class AuthRequest < ProtocolBuffers::Message
required :int32, :vers, 1
required :int32, :orgID, 2
required :string, :password, 3
end
class MyServer
def self.authenticate(socket, params)
auth = AuthRequest.new(:vers => params[:vers], :orgID => params[:orgID], :password => params[:password])
size = message_size(auth)
if size.present?
socket.write([size, 0].pack 'NN')
auth.serialize(socket)
socket.flush
end
end
def self.message_size(obj)
size = nil
io = File.new('tempfile', 'w')
begin
obj.serialize(io)
io.flush
size = io.stat.size + 4
ensure
io.close
end
size
end
end
end
Controller:
require 'my_server'
require 'socket'
class MyServerTestController < ActionController::Base
def test
socket = TCPSocket.new('192.168.1.15', '12345')
begin
MyServer::MyServer.authenticate(socket, {vers: 1, orgID: 100, password: 'hello'})
ensure
socket.close
end
end
end

You can easily use StringIO as your memory stream (it is part of the standard library; add require 'stringio' if it is not already loaded). Mind you, it is called StringIO because it is implemented on top of a string, but it is by no means limited to text data; it works just as well with binary data:
def self.message_size_mem(obj)
size = nil
io = StringIO.new
begin
obj.serialize(io)
io.flush
size = io.size + 4
ensure
io.close
end
size
end
auth = AuthRequest.new(:vers => 122324, orgID: 9900235, password: 'this is a test for serialization')
MyServer.message_size(auth)
# => 47
MyServer.message_size_mem(auth)
# => 47
io = StringIO.new
auth.serialize(io)
io.flush
io.string
# => "\bԻ\a\u0010ˡ\xDC\u0004\u001A this is a test for serialization"
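Since the serialized bytes already sit in the StringIO once the size has been measured, the same buffer can be reused to build the whole frame, so the object is serialized only once. A minimal sketch, assuming the header layout from the question (a 32-bit big-endian size that counts 4 extra bytes, followed by a 32-bit zero). frame_message and FakeMessage are hypothetical names; FakeMessage stands in for any object that responds to serialize(io):

```ruby
require 'stringio'

# Hypothetical stand-in for a Protocol Buffers message: anything that
# responds to serialize(io) works with frame_message below.
FakeMessage = Struct.new(:bytes) do
  def serialize(io)
    io.write(bytes)
  end
end

# Serializes obj once into memory and returns header + payload as one
# binary string. The header layout mirrors the question: the size
# (payload bytes + 4) and a zero, both as 32-bit big-endian integers.
def frame_message(obj)
  io = StringIO.new(String.new) # String.new yields an empty binary string
  obj.serialize(io)
  payload = io.string
  [payload.bytesize + 4, 0].pack('NN') + payload
end

frame = frame_message(FakeMessage.new('hello'))
puts frame.bytesize             # 8 header bytes plus 5 payload bytes
puts frame.unpack('NN').inspect # the two header integers
```

With a real message the call would be something like socket.write(frame_message(auth)), which also sends header and body in a single write.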

Related

ruby-snmp: how to automatically convert response to its proper type?

In ruby I read some SNMP registers. Response is an array of objects.
Is there a nice way to convert each object to the proper type avoiding the case..when in the following code? It looks strange that it must be converted manually as the type is already known:
require 'snmp'
HOST = '127.0.0.1'.freeze
registers = ['sysContact.0', 'sysUpTime.0',
'upsIdentManufacturer.0', 'upsIdentModel.0', 'upsIdentName.0']
params_array = {}
SNMP::Manager.open(host: HOST) do |manager|
manager.load_module('UPS-MIB')
response = manager.get(registers)
response.each_varbind do |vb|
##################################
# change from here...
value = nil
case vb.value.asn1_type
when 'OCTET STRING' # <==========
value = vb.value
when 'INTEGER' # <==========
value = vb.value.to_i
when 'TimeTicks' # <==========
value = vb.value.to_s
else
puts "Type '#{vb.value.asn1_type}' not recognized!"
exit(1)
end
params_array[vb.name.to_s] = value
# ... to here
##################################
# with something like
# params_array[vb.name.to_s] = vb.value._to_its_proper_type_
end
end
pp params_array
Looking at the code in the gem repo, it doesn't look like there is a method for this. I suppose you could try to monkey patch it, but I'm not sure it's worth the trouble.
If you don't like the switch syntax, you could just use a hash lookup like this:
require 'snmp'
HOST = '127.0.0.1'.freeze
TYPE_VALUES = {
'OCTET STRING' => :to_s,
'INTEGER' => :to_i,
'TimeTicks' => :to_s
}.freeze
registers = ['sysContact.0', 'sysUpTime.0',
'upsIdentManufacturer.0', 'upsIdentModel.0', 'upsIdentName.0']
params_array = {}
SNMP::Manager.open(host: HOST) do |manager|
manager.load_module('UPS-MIB')
response = manager.get(registers)
response.each_varbind do |vb|
if method = TYPE_VALUES[vb.value.asn1_type]
params_array[vb.name.to_s] = vb.value.send(method)
else
puts "Type '#{vb.value.asn1_type}' not recognized!"
exit(1)
end
end
end
pp params_array
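The lookup-table idea can be tried without an SNMP agent. The sketch below is only an illustration: FakeValue is a hypothetical stand-in for the gem's value objects (it exposes asn1_type, to_s and to_i, which is all the lookup relies on), and convert and CONVERSIONS are made-up names. It raises on an unknown type instead of calling exit, so the caller decides how to react:

```ruby
# Hypothetical stand-in for an SNMP value: it exposes asn1_type plus
# the usual conversion methods, which is all the lookup relies on.
FakeValue = Struct.new(:asn1_type, :raw) do
  def to_s
    raw.to_s
  end

  def to_i
    raw.to_i
  end
end

# ASN.1 type name => conversion method, as in the answer above.
CONVERSIONS = {
  'OCTET STRING' => :to_s,
  'INTEGER'      => :to_i,
  'TimeTicks'    => :to_s
}.freeze

# Converts one value; raises ArgumentError for a type that is not in
# the table instead of terminating the process.
def convert(value)
  method_name = CONVERSIONS.fetch(value.asn1_type) do |type|
    raise ArgumentError, "Type '#{type}' not recognized!"
  end
  value.public_send(method_name)
end

p convert(FakeValue.new('INTEGER', '42'))
p convert(FakeValue.new('OCTET STRING', 'admin@example.org'))
```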

Message size varies TCPServer Ruby

I'm working with an AVL (Skypatrol TT8750+). The messages it sends (over TCP) are supposed to be 59 bytes long, but the first message it sends is always 33 bytes (it carries some information about the AVL, so the user can identify it).
So the question is: how can I handle these different-sized messages in Ruby?
require 'socket'
portnumber = 12050
socketServer = TCPServer.open(portnumber)
while true
Thread.new(socketServer.accept) do |connection|
puts "Accepting connection from: #{connection.peeraddr[2]}"
t = Time.now.strftime("%d-%m-%Y %H%M")
file_name = t + '.txt'
out_file = File.new(file_name, "w+")
begin
while connection
incomingData = connection.gets()
if incomingData != nil
incomingData = incomingData
end
hex_line = incomingData.unpack('H*')[0]
out_file.puts(hex_line)
puts "Incoming: #{hex_line}"
end
rescue Exception => e
# Displays Error Message
puts "#{ e } (#{ e.class })"
ensure
connection.close
puts "ensure: Closing"
end
end
end
This is the experimental code that I'm using.
I'm posting this answer to explain a comment I made to Anderson's answer. Most of the code isn't mine.
moving the if out of the loop
When the if statement is within a loop, it will be evaluated each and every time the loop runs, increasing the number of CPU instructions and the complexity of each loop.
You could improve performance by moving the conditional statement out of the loop like so:
require 'socket'
require 'celluloid/io'
portnumber = 12050
socketServer = TCPServer.open(portnumber)
incomingData = nil
while true
Thread.new(socketServer.accept) do |connection|
puts "Accepting connection from: #{connection.peeraddr[2]}"
# this should probably be changed,
# it ignores the possibility of two connections arriving at the same timestamp.
t = Time.now.strftime("%d-%m-%Y %H%M")
file_name = t + '.txt'
out_file = File.new(file_name, "w+")
begin
if connection
incomingData = connection.recv(33)
if incomingData != nil
incomingData = incomingData.unpack('H*')[0]
out_file.puts(incomingData)
puts "Incoming: #{incomingData}"
end
end
while connection
incomingData = connection.recv(59)
if incomingData != nil
incomingData = incomingData.unpack('H*')[0]
out_file.puts(incomingData)
puts "Incoming: #{incomingData}"
end
end
rescue Exception => e
# Displays Error Message
puts "#{ e } (#{ e.class })"
ensure
connection.close
out_file.close
puts "ensure: Closing"
end
end
end
Optimizing the recv method
Another optimization I should probably mention (but won't implement here) would be the recv method call.
This is both an optimization and a possible source of errors that should be addressed.
recv is a system call, and since network messages might be combined (or fragmented) across TCP/IP packets, it might become more expensive to call recv than to maintain an internal buffer of data that resolves fragmentation and overflow states.
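To make the internal-buffer idea concrete, here is a rough sketch. FrameBuffer and feed are hypothetical names, and the sizes (one 33-byte message followed by 59-byte messages) are taken from the question. The buffer accepts chunks of any size and hands back only complete messages, so it does not matter how the network split or merged them:

```ruby
# Hypothetical helper: collects raw chunks and returns complete messages.
# The sizes come from the question: one 33-byte message, then 59-byte ones.
class FrameBuffer
  def initialize(first_size = 33, later_size = 59)
    @expecting = first_size
    @later_size = later_size
    @buffer = String.new # empty binary string
  end

  # Appends a chunk of any size and returns the messages it completes.
  def feed(chunk)
    @buffer << chunk.b # .b forces binary so slicing counts bytes
    messages = []
    while @buffer.bytesize >= @expecting
      messages << @buffer.slice!(0, @expecting)
      @expecting = @later_size
    end
    messages
  end
end

buffer = FrameBuffer.new
stream = ('a' * 33) + ('b' * 59) + ('c' * 59)
# Feed the stream in awkward 40-byte pieces, as the network might.
stream.scan(/.{1,40}/m).each do |piece|
  buffer.feed(piece).each { |msg| puts msg.unpack('H*')[0] }
end
```

In the server loop this would sit between a single large recv call and the file write, for example buffer.feed(connection.recv(4096)).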
Reconsidering the thread-per-client design
I would also recommend avoiding the thread-per-client design.
In general, for a small number of clients it probably doesn't matter much.
However, as clients multiply and threads become busier, you might find the system spends more resources on context switches than on actual tasks.
Another concern might be the stack allocated for each thread (1 MB or 2 MB for Ruby threads, if I remember correctly). In a best-case scenario, 1,000 clients will require more than a gigabyte of memory just for the stacks (I'm ignoring kernel data structures and other resources).
I would consider using EventMachine or Iodine (I'm iodine's author, so I'm biased).
An evented design could save you many resources.
For example (untested):
require 'iodine'
# define the protocol for our service
class ExampleProtocol
@timeout = 10
def on_open
puts "New Connection Accepted."
# this should probably be changed,
# it ignores the possibility of two connections arriving at the same timestamp.
t = Time.now.strftime("%d-%m-%Y %H%M")
file_name = t + '.txt'
@out_file = File.new(file_name, "w+")
# a rolling buffer for fragmented messages
@expecting = 33
@msg = ""
end
def on_message buffer
length = buffer.length
pos = 0
while length >= @expecting
@msg << (buffer[pos, @expecting])
@out_file.puts(@msg.unpack('H*')[0])
length -= @expecting
pos += @expecting
@expecting = 59
@msg.clear
end
if(length > 0)
# keep the partial message and remember how many bytes are still missing
@msg << (buffer[pos, length])
@expecting -= length
end
end
def on_close
@out_file.close
end
end
# create the service instance
Iodine.listen 12050, ExampleProtocol
# start the service
Iodine.start
The solution was quite simple
require 'socket'
require 'celluloid/io'
portnumber = 12050
socketServer = TCPServer.open(portnumber)
while true
Thread.new(socketServer.accept) do |connection|
puts "Accepting connection from: #{connection.peeraddr[2]}"
t = Time.now.strftime("%d-%m-%Y %H%M")
file_name = t + '.txt'
out_file = File.new(file_name, "w+")
messagecounter = 1
begin
while connection
if messagecounter == 1
incomingData = connection.recv(33)
messagecounter += 1
else
incomingData = connection.recv(59)
end
if incomingData != nil
incomingData = incomingData.unpack('H*')[0]
end
out_file.puts(incomingData)
puts "Incoming: #{incomingData}"
end
rescue Exception => e
# Displays Error Message
puts "#{ e } (#{ e.class })"
ensure
connection.close
puts "ensure: Closing"
end
end
end
I just needed an extra variable and an if to auto increment the variable, and that's it.
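One caveat with recv(n): it returns at most n bytes, so a fragmented message can come back short, and since connection is always truthy the while loop never ends on its own when the peer disconnects. IO#read(n), by contrast, blocks until n bytes have arrived or the stream ends. Below is a sketch of the same "first message is different" idea built on read. each_message is a hypothetical helper, and a StringIO stands in for the socket so the example runs on its own:

```ruby
require 'stringio'

# Hypothetical helper: yields each complete message read from io.
# io can be a TCPSocket or anything else that implements read(n).
def each_message(io, first_size = 33, later_size = 59)
  size = first_size
  # read(n) blocks until n bytes are available; it returns nil at end
  # of stream, or a shorter string if the stream ends mid-message.
  while (data = io.read(size))
    break if data.bytesize < size # truncated tail, stop here
    yield data
    size = later_size
  end
end

# A StringIO stands in for the socket in this example.
fake_socket = StringIO.new(('a' * 33) + ('b' * 59))
each_message(fake_socket) { |msg| puts msg.unpack('H*')[0] }
```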

Typhoeus Hydra run out of memory

I wrote a script that checks URLs from a file (using the Ruby gem Typhoeus). I don't know why, but memory usage keeps growing while my code runs. Usually the script crashes after about 10,000 URLs.
Is there any solution for this? Thanks in advance for your help.
My code:
require 'rubygems'
require 'logger'
require 'typhoeus'
def run file
log = Logger.new('log')
hydra = Typhoeus::Hydra.new(:max_concurrency => 30)
hydra.disable_memoization
File.open(file).each do |url|
begin
request = Typhoeus::Request.new(url.strip, :method => :get, :follow_location => true)
request.on_complete do |resp|
check_website(url, resp.body)
end
puts "queuing #{ url }"
hydra.queue(request)
request.destroy
rescue Exception => e
log.error e
end
end
hydra.run
end
One approach might be to adapt your file processing - instead of reading a line from the file and immediately creating the request object, try processing them in batches (say 5000 at a time) and throttle your request rate / memory consumption.
I've improved my code; as you suggested, I'm now feeding URLs to Hydra in batches.
It runs with normal memory usage, but I don't know why it just stops picking up new URLs after about 1000. This is very strange: there are no errors and the script is still running, but it doesn't send or receive any new requests. My code:
def run file, concurrency
log = Logger.new('log')
log.info '*** Hydra started ***'
queue = []
File.open(file).each do |uri|
queue << uri
if queue.size == concurrency * 5
hydra = Typhoeus::Hydra.new(:max_concurrency => concurrency)
hydra.disable_memoization
queue.each do |url|
request = Typhoeus::Request.new(url.strip, :method => :get, :follow_location => true, :max_redirections => 2, :timeout => 5000)
request.on_complete do |resp|
check_website(url, resp.body)
puts "#{url} code: #{resp.code} curl_msg #{resp.curl_error_message}"
end
puts "queuing #{url}"
hydra.queue(request)
end
puts 'hydra run'
hydra.run
queue = []
end
end
log.info '*** Hydra finished work ***'
end
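One thing worth checking in the loop above: a batch only runs when queue.size reaches concurrency * 5, so whatever is left over at the end of the file is never requested. That may or may not be related to the stall, but it is easy to rule out, because Enumerable#each_slice also yields the final, shorter batch. A minimal sketch; each_url_batch is a hypothetical helper, and the block is where a fresh Hydra would be created, filled and run:

```ruby
require 'stringio'

# Hypothetical helper: reads URLs line by line and yields them in
# arrays of at most batch_size, including the last, shorter batch.
def each_url_batch(io, batch_size)
  io.each_line.each_slice(batch_size) do |lines|
    urls = lines.map(&:strip).reject(&:empty?)
    yield urls unless urls.empty?
  end
end

# A StringIO stands in for the URL file in this example.
list = StringIO.new("http://a.example\nhttp://b.example\nhttp://c.example\n")
each_url_batch(list, 2) { |batch| p batch }
```

With a real file this would be wrapped as File.open(file) { |f| each_url_batch(f, concurrency * 5) { |batch| ... } }.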

Am I using EventMachine in the right way?

I am using ruby-smpp and Redis to achieve a queue-based background worker that sends SMPP messages.
And I am wondering if I am using EventMachine in the right way. It works, but it doesn't feel right.
#!/usr/bin/env ruby
# Sample SMS gateway that can receive MOs (mobile originated messages) and
# DRs (delivery reports), and send MTs (mobile terminated messages).
# MTs are, in the name of simplicity, entered on the command line in the format
# <sender> <receiver> <message body>
# MOs and DRs will be dumped to standard out.
require 'smpp'
require 'redis/connection/hiredis'
require 'redis'
require 'yajl'
require 'time'
LOGFILE = File.dirname(__FILE__) + "/sms_gateway.log"
PIDFILE = File.dirname(__FILE__) + '/worker_test.pid'
Smpp::Base.logger = Logger.new(LOGFILE)
#Smpp::Base.logger.level = Logger::WARN
REDIS = Redis.new
class MbloxGateway
# MT id counter.
@@mt_id = 0
# expose SMPP transceiver's send_mt method
def self.send_mt(sender, receiver, body)
if sender =~ /[a-z]+/i
source_addr_ton = 5
else
source_addr_ton = 2
end
@@mt_id += 1
@@tx.send_mt(('smpp' + @@mt_id.to_s), sender, receiver, body, {
:source_addr_ton => source_addr_ton
# :service_type => 1,
# :source_addr_ton => 5,
# :source_addr_npi => 0 ,
# :dest_addr_ton => 2,
# :dest_addr_npi => 1,
# :esm_class => 3 ,
# :protocol_id => 0,
# :priority_flag => 0,
# :schedule_delivery_time => nil,
# :validity_period => nil,
# :registered_delivery=> 1,
# :replace_if_present_flag => 0,
# :data_coding => 0,
# :sm_default_msg_id => 0
#
})
end
def logger
Smpp::Base.logger
end
def start(config)
# Write this workers pid to a file
File.open(PIDFILE, 'w') { |f| f << Process.pid }
# The transceiver sends MT messages to the SMSC. It needs a storage with Hash-like
# semantics to map SMSC message IDs to your own message IDs.
pdr_storage = {}
# Run EventMachine in loop so we can reconnect when the SMSC drops our connection.
loop do
EventMachine::run do
@@tx = EventMachine::connect(
config[:host],
config[:port],
Smpp::Transceiver,
config,
self # delegate that will receive callbacks on MOs and DRs and other events
)
# Let the connection start before we check for messages
EM.add_timer(3) do
# Maybe there is some better way to do this. IDK, But it works!
EM.defer do
loop do
# Pop a message
message = REDIS.lpop 'messages:send:queue'
if message # If there is a message. Process it and check the queue again
message = Yajl::Parser.parse(message, :check_utf8 => false) # Parse the message from Json to Ruby hash
if !message['send_after'] or (message['send_after'] and Time.parse(message['send_after']) < Time.now)
self.class.send_mt(message['sender'], message['receiver'], message['body']) # Send the message
REDIS.publish 'log:messages', "#{message['sender']} -> #{message['receiver']}: #{message['body']}" # Push the message to the redis queue so we can listen to the channel
else
REDIS.lpush 'messages:queue', Yajl::Encoder.encode(message)
end
else # If there is no message. Sleep for a second
sleep 1
end
end
end
end
end
sleep 2
end
end
# ruby-smpp delegate methods
def mo_received(transceiver, pdu)
logger.info "Delegate: mo_received: from #{pdu.source_addr} to #{pdu.destination_addr}: #{pdu.short_message}"
end
def delivery_report_received(transceiver, pdu)
logger.info "Delegate: delivery_report_received: ref #{pdu.msg_reference} stat #{pdu.stat}"
end
def message_accepted(transceiver, mt_message_id, pdu)
logger.info "Delegate: message_accepted: id #{mt_message_id} smsc ref id: #{pdu.message_id}"
end
def message_rejected(transceiver, mt_message_id, pdu)
logger.info "Delegate: message_rejected: id #{mt_message_id} smsc ref id: #{pdu.message_id}"
end
def bound(transceiver)
logger.info "Delegate: transceiver bound"
end
def unbound(transceiver)
logger.info "Delegate: transceiver unbound"
EventMachine::stop_event_loop
end
end
# Start the Gateway
begin
puts "Starting SMS Gateway. Please check the log at #{LOGFILE}"
# SMPP properties. These parameters work well with the Logica SMPP simulator.
# Consult the SMPP spec or your mobile operator for the correct settings of
# the other properties.
config = {
:host => 'server.com',
:port => 3217,
:system_id => 'user',
:password => 'password',
:system_type => 'type', # default given according to SMPP 3.4 Spec
:interface_version => 52,
:source_ton => 0,
:source_npi => 1,
:destination_ton => 1,
:destination_npi => 1,
:source_address_range => '',
:destination_address_range => '',
:enquire_link_delay_secs => 10
}
gw = MbloxGateway.new
gw.start(config)
rescue Exception => ex
puts "Exception in SMS Gateway: #{ex} at #{ex.backtrace.join("\n")}"
end
Some easy steps to make this code more EventMachine-ish:
Get rid of the blocking Redis driver, use em-hiredis
Stop using defer. Pushing work out to threads with the Redis driver will make things even worse as it relies on locks around the socket it's using.
Get rid of the add_timer(3)
Get rid of the inner loop; replace it by rescheduling a block for the next event loop iteration using EM.next_tick. The outer loop is somewhat unnecessary too: you shouldn't loop around EM.run. It's cleaner to handle a disconnect by reconnecting in your unbound method, by calling @@tx.reconnect, instead of stopping and restarting the event loop.
Don't sleep, just wait. EventMachine will tell you when new things come in on a network socket.
Here's what the core code around EventMachine would look like with some of the improvements:
def start(config)
File.open(PIDFILE, 'w') { |f| f << Process.pid }
pdr_storage = {}
EventMachine::run do
@@tx = EventMachine::connect(
config[:host],
config[:port],
Smpp::Transceiver,
config,
self
)
# A local variable: assigning a constant inside a method is a syntax error.
# EM::Hiredis comes from the em-hiredis gem (require 'em-hiredis').
redis = EM::Hiredis.connect
pop_message = lambda do
redis.lpop 'messages:send:queue' do |message|
if message # If there is a message. Process it and check the queue again
message = Yajl::Parser.parse(message, :check_utf8 => false) # Parse the message from Json to Ruby hash
if !message['send_after'] or (message['send_after'] and Time.parse(message['send_after']) < Time.now)
self.class.send_mt(message['sender'], message['receiver'], message['body'])
redis.publish 'log:messages', "#{message['sender']} -> #{message['receiver']}: #{message['body']}"
else
redis.lpush 'messages:queue', Yajl::Encoder.encode(message)
end
end
EM.next_tick(&pop_message)
end
end
# Start the first poll; each callback then schedules the next one.
pop_message.call
end
end
Not perfect, and it could use some cleaning up too, but this is closer to what it should look like in the EventMachine manner: no sleeps, no defer where it can be avoided, no network drivers that can block, and traditional loops implemented by rescheduling work on the next reactor tick. In terms of Redis the difference is not that big, but it's more EventMachine-y this way, imho.
Hope this helps. Happy to explain further if you still have questions.
You're doing blocking Redis calls in EM's reactor loop. It works, but isn't the way to go. You could take a look at em-hiredis to properly integrate Redis calls with EM.

streaming html from webrick?

Has anyone tried streaming html/text/content from webrick? I've tried assigning an IO to the response body, but webrick is waiting for the stream to be closed first.
I found this link by accident (http://redmine.ruby-lang.org/attachments/download/161), which contains a WEBrick patch:
# Copyright (C) 2008 Brian Candler, released under Ruby Licence.
#
# A collection of small monkey-patches to webrick.
require 'webrick'
module WEBrick
class HTTPRequest
# Generate HTTP/1.1 100 continue response. See
# http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/18459
def continue
if self['expect'] == '100-continue' && @config[:HTTPVersion] >= "1.1"
@socket.write "HTTP/#{@config[:HTTPVersion]} 100 continue\r\n\r\n"
@header.delete('expect')
end
end
end
class HTTPResponse
alias :orig_setup_header :setup_header
# Correct termination of streamed HTTP/1.1 responses. See
# http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/18454 and
# http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/18565
def setup_header
orig_setup_header
unless chunked? || @header['content-length']
@header['connection'] = "close"
@keep_alive = false
end
end
# Allow streaming of zipfile entry. See
# http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/18460
def send_body(socket)
if @body.respond_to?(:read) then send_body_io(socket)
elsif @body.respond_to?(:call) then send_body_proc(socket)
else send_body_string(socket)
end
end
# If the response body is a proc, then we invoke it and pass in
# an object which supports "write" and "<<" methods. This allows
# arbitrary output streaming.
def send_body_proc(socket)
if @request_method == "HEAD"
# do nothing
elsif chunked?
@body.call(ChunkedWrapper.new(socket, self))
_write_data(socket, "0#{CRLF}#{CRLF}")
else
size = @header['content-length'].to_i
@body.call(socket) # TODO: StreamWrapper which supports offset, size
@sent_size = size
end
end
class ChunkedWrapper
def initialize(socket, resp)
@socket = socket
@resp = resp
end
def write(buf)
return if buf.empty?
data = ""
data << format("%x", buf.size) << CRLF
data << buf << CRLF
socket = @socket
@resp.instance_eval {
_write_data(socket, data)
@sent_size += buf.size
}
end
alias :<< :write
end
end
end
if RUBY_VERSION < "1.9"
old_verbose, $VERBOSE = $VERBOSE, nil
# Increase from default of 4K for efficiency, similar to
# http://svn.ruby-lang.org/cgi-bin/viewvc.cgi/branches/ruby_1_8/lib/net/protocol.rb?r1=11708&r2=12092
# In trunk the default is 64K and can be adjusted using :InputBufferSize,
# :OutputBufferSize
WEBrick::HTTPRequest::BUFSIZE = 16384
WEBrick::HTTPResponse::BUFSIZE = 16384
$VERBOSE = old_verbose
end
To use it, simply pass a proc as the response body, like so:
res.body = proc { |w|
10.times do
w << Time.now.to_s
sleep(1)
end
}
woot!
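For anyone curious what the ChunkedWrapper in the patch actually puts on the wire, here is a stripped-down sketch of the same idea outside WEBrick. ChunkWriter and finish are hypothetical names, and a StringIO stands in for the socket. Each write becomes one HTTP/1.1 chunk (length in hex, CRLF, data, CRLF), and a zero-length chunk ends the body:

```ruby
require 'stringio'

CRLF = "\r\n"

# Hypothetical, simplified version of the ChunkedWrapper idea: each
# write becomes one HTTP/1.1 chunk (hex length, CRLF, data, CRLF).
class ChunkWriter
  def initialize(io)
    @io = io
  end

  def write(buf)
    return if buf.empty?
    # bytesize rather than size: the chunk length counts bytes, and the
    # two differ for multibyte strings on Ruby 1.9 and later.
    @io.write(format('%x', buf.bytesize) + CRLF + buf + CRLF)
  end
  alias_method :<<, :write

  # A zero-length chunk marks the end of the body.
  def finish
    @io.write('0' + CRLF + CRLF)
  end
end

wire = StringIO.new
out = ChunkWriter.new(wire)
out << 'hello'
out << 'streaming world!'
out.finish
p wire.string
```

Note that the patch itself uses buf.size, which was a byte count on the Ruby 1.8 it was written for.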
I would suggest against using WEBrick for anything really, it's junk. I would say try Mongrel.
I know that wasn't your question, it's just some friendly advice.
