How to stop a background thread in Sinatra once the connection is closed - ruby

I'm trying to consume the twitter streaming API with Sinatra and give users real-time updates when they search for a keyword.
require 'sinatra'
require 'eventmachine'
require 'em-http'
require 'json'

STREAMING_URL = 'https://stream.twitter.com/1/statuses/sample.json'

get '/' do
  stream(:keep_open) do |out|
    http = EM::HttpRequest.new(STREAMING_URL).get :head => { 'Authorization' => [ 'USERNAME', 'PASS' ] }
    buffer = ""
    http.stream do |chunk|
      puts "still chugging"
      buffer += chunk
      while line = buffer.slice!(/.+\r?\n/)
        tweet = JSON.parse(line)
        unless tweet.length == 0 or tweet['user'].nil?
          out << "<p><b>#{tweet['user']['screen_name']}</b>: #{tweet['text']}</p>"
        end
      end
    end
  end
end
I want the processing of the em-http-request stream to stop if the user closes the connection. Does anyone know how to do this?

Eric's answer was close, but what it does is close the response body (not the client connection, by the way) once your Twitter stream closes, which normally never happens. This should work:
require 'sinatra/streaming' # gem install sinatra-contrib
# ...

get '/' do
  stream(:keep_open) do |out|
    # ...
    out.callback { http.conn.close_connection }
    out.errback  { http.conn.close_connection }
  end
end
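Putting the two together, a minimal sketch of the whole route (assuming, as above, that em-http-request exposes the underlying connection as http.conn) would look something like this:
require 'sinatra'
require 'sinatra/streaming' # gem install sinatra-contrib
require 'em-http'
require 'json'

STREAMING_URL = 'https://stream.twitter.com/1/statuses/sample.json'

get '/' do
  stream(:keep_open) do |out|
    http = EM::HttpRequest.new(STREAMING_URL).get :head => { 'Authorization' => [ 'USERNAME', 'PASS' ] }
    buffer = ""

    http.stream do |chunk|
      buffer += chunk
      while line = buffer.slice!(/.+\r?\n/)
        tweet = JSON.parse(line)
        unless tweet.length == 0 or tweet['user'].nil?
          out << "<p><b>#{tweet['user']['screen_name']}</b>: #{tweet['text']}</p>"
        end
      end
    end

    # stop consuming the upstream feed as soon as the client disconnects
    out.callback { http.conn.close_connection }
    out.errback  { http.conn.close_connection }
  end
end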

I'm not quite familiar with the Sinatra stream API yet, but did you try this?
http.callback { out.close }

Related

Efficient way to render ton of JSON on Heroku

I built a simple API with one endpoint. It scrapes files and currently has around 30,000 records. I would ideally like to be able to fetch all those records in JSON with one http call.
Here is my Sinatra view code:
require 'sinatra'
require 'json'
require 'mongoid'

Mongoid.identity_map_enabled = false

get '/' do
  content_type :json
  Book.all
end
I've tried the following:
using multi_json with
require './require.rb'
require 'sinatra'
require 'multi_json'

MultiJson.engine = :yajl
Mongoid.identity_map_enabled = false

get '/' do
  content_type :json
  MultiJson.encode(Book.all)
end
The problem with this approach is I get Error R14 (Memory quota exceeded). I get the same error when I try to use the 'oj' gem.
I would just concatenate everything into one long Redis string, but Heroku's Redis service is $30 per month for the instance size I would need (> 10 MB).
My current solution is to use a background task that creates objects and stuffs them full of jsonified records at close to the Mongoid document size limit (16 MB). The problems with this approach: it still takes nearly 30 seconds to render, and I have to run post-processing on the receiving app to properly extract the JSON from the objects.
Does anyone have a better idea for how I can render JSON for 30k records in one call without switching away from Heroku?
Sounds like you want to stream the JSON directly to the client instead of building it all up in memory. It's probably the best way to cut down memory usage. You could for example use yajl to encode JSON directly to a stream.
Edit: I rewrote the entire code for yajl, because its API is much more compelling and allows for much cleaner code. I also included an example for reading the response in chunks. Here's the streamed JSON array helper I wrote:
require 'yajl'

module JsonArray
  class StreamWriter
    def initialize(out)
      super()
      @out = out
      @encoder = Yajl::Encoder.new
      @first = true
    end

    def <<(object)
      @out << ',' unless @first
      @out << @encoder.encode(object)
      @out << "\n"
      @first = false
    end
  end

  def self.write_stream(app, &block)
    app.stream do |out|
      out << '['
      block.call StreamWriter.new(out)
      out << ']'
    end
  end
end
Usage:
require 'sinatra'
require 'mongoid'

Mongoid.identity_map_enabled = false

# use a server that supports streaming
set :server, :thin

get '/' do
  content_type :json
  JsonArray.write_stream(self) do |json|
    Book.all.each do |book|
      json << book.attributes
    end
  end
end
To decode on the client side you can read and parse the response in chunks, for example with em-http. Note that this solution requires the client's memory to be large enough to store the entire objects array. Here's the corresponding streamed parser helper:
require 'yajl'

module JsonArray
  class StreamParser
    def initialize(&callback)
      @parser = Yajl::Parser.new
      @parser.on_parse_complete = callback
    end

    def <<(str)
      @parser << str
    end
  end

  def self.parse_stream(&callback)
    StreamParser.new(&callback)
  end
end
Usage:
require 'em-http'

parser = JsonArray.parse_stream do |object|
  # block is called when we are done parsing the
  # entire array; now we can handle the data
  p object
end

EventMachine.run do
  http = EventMachine::HttpRequest.new('http://localhost:4567').get
  http.stream do |chunk|
    parser << chunk
  end
  http.callback do
    EventMachine.stop
  end
end
Alternative solution
You could actually simplify the whole thing a lot when you give up the need for generating a "proper" JSON array. What the above solution generates is JSON in this form:
[{ ... book_1 ... }
,{ ... book_2 ... }
,{ ... book_3 ... }
...
,{ ... book_n ... }
]
We could however stream each book as a separate JSON and thus reduce the format to the following:
{ ... book_1 ... }
{ ... book_2 ... }
{ ... book_3 ... }
...
{ ... book_n ... }
The code on the server would then be much simpler:
require 'sinatra'
require 'mongoid'
require 'yajl'

Mongoid.identity_map_enabled = false
set :server, :thin

get '/' do
  content_type :json
  encoder = Yajl::Encoder.new
  stream do |out|
    Book.all.each do |book|
      out << encoder.encode(book.attributes) << "\n"
    end
  end
end
As well as the client:
require 'em-http'
require 'yajl'

parser = Yajl::Parser.new
parser.on_parse_complete = Proc.new do |book|
  # this will now be called separately for every book
  p book
end

EventMachine.run do
  http = EventMachine::HttpRequest.new('http://localhost:4567').get
  http.stream do |chunk|
    parser << chunk
  end
  http.callback do
    EventMachine.stop
  end
end
The great thing is that now the client does not have to wait for the entire response, but instead parses every book separately. However, this will not work if one of your clients expects one single big JSON array.
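If a client would rather not depend on yajl, the newline-delimited variant can also be consumed line by line with nothing but the standard library; here's a rough sketch (the URL is just the local test server used above):
require 'net/http'
require 'json'

uri = URI('http://localhost:4567/')

Net::HTTP.start(uri.host, uri.port) do |http|
  request = Net::HTTP::Get.new(uri.request_uri)
  buffer = ""
  # read_body with a block yields the response in chunks as they arrive
  http.request(request) do |response|
    response.read_body do |chunk|
      buffer << chunk
      while line = buffer.slice!(/.+\n/)
        book = JSON.parse(line)
        p book # handle each book as soon as its line is complete
      end
    end
  end
end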

Ctrl+C not killing Sinatra + EM::WebSocket servers

I'm building a Ruby app that runs both an EM::WebSocket server as well as a Sinatra server. Individually, I believe both of these are equipped to handle a SIGINT. However, when running both in the same app, the app continues when I press Ctrl+C. My assumption is that one of them is capturing the SIGINT, preventing the other from capturing it as well. I'm not sure how to go about fixing it, though.
Here's the code in a nutshell:
require 'thin'
require 'sinatra/base'
require 'em-websocket'

EventMachine.run do
  class Web::Server < Sinatra::Base
    get('/') { erb :index }
    run!(port: 3000)
  end

  EM::WebSocket.start(port: 3001) do |ws|
    # connect/disconnect handlers
  end
end
I had the same issue. The key for me seemed to be to start Thin in the reactor loop with signals: false:
Thin::Server.start(
  App, '0.0.0.0', 3000,
  signals: false
)
This is complete code for a simple chat server:
require 'thin'
require 'sinatra/base'
require 'em-websocket'

class App < Sinatra::Base
  # threaded - False: Will take requests on the reactor thread
  #            True:  Will queue request for background thread
  configure do
    set :threaded, false
  end

  get '/' do
    erb :index
  end
end

EventMachine.run do
  # hit Control + C to stop
  Signal.trap("INT") {
    puts "Shutting down"
    EventMachine.stop
  }
  Signal.trap("TERM") {
    puts "Shutting down"
    EventMachine.stop
  }

  @clients = []

  EM::WebSocket.start(:host => '0.0.0.0', :port => '3001') do |ws|
    ws.onopen do |handshake|
      @clients << ws
      ws.send "Connected to #{handshake.path}."
    end

    ws.onclose do
      ws.send "Closed."
      @clients.delete ws
    end

    ws.onmessage do |msg|
      puts "Received message: #{msg}"
      @clients.each do |socket|
        socket.send msg
      end
    end
  end

  Thin::Server.start(
    App, '0.0.0.0', 3000,
    signals: false
  )
end
I downgraded Thin to version 1.5.1 and it just works. Weird.
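If downgrading does the trick for you, pinning the version in the Gemfile is all that's needed (version number taken from the comment above):
# Gemfile
gem 'thin', '1.5.1'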

Responding to HTTP Requests in Eventmachine

I have a very simple server for use in integration tests, built using eventmachine:
EM.run do
  EM::start_server(server, port, HttpRecipient)
end
I can receive HTTP requests and parse them like so:
class HttpRecipient < EM::Connection
  def initialize
    @@stored = ''
  end

  # Data is received in chunks, so here we wait until we've got a full web request before
  # calling spool.
  def receive_data(data)
    @@stored << data
    begin
      spool(@@stored)
      EM.stop
    rescue WEBrick::HTTPStatus::BadRequest
      # Not received a complete request yet
    end
  end

  def spool(data)
    # Parse the request
    req = WEBrick::HTTPRequest.new(WEBrick::Config::HTTP)
    req.parse(StringIO.new(@@stored))
    # Send a response, e.g. HTTP OK
  end
end
The question is, how do I send a response? EventMachine provides send_data for sending data, but that doesn't understand HTTP. Similarly, there is the em-http-request module for sending requests, but it's not obvious that it is capable of generating responses.
I can generate HTTP messages manually and then send them using send_data, but I wonder if there is a cleaner way, using an existing HTTP library or the functionality built into EventMachine?
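For reference, the manual approach mentioned above is only a handful of lines; a minimal sketch of a hand-built response sent with send_data from inside the connection class (send_ok_response is just an illustrative name, with status and headers hard-coded):
class HttpRecipient < EM::Connection
  # ...

  def send_ok_response(body)
    response  = "HTTP/1.1 200 OK\r\n"
    response << "Content-Type: text/plain\r\n"
    response << "Content-Length: #{body.bytesize}\r\n"
    response << "Connection: close\r\n"
    response << "\r\n"
    response << body
    send_data response
    close_connection_after_writing
  end
end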
If you want something easy then use Thin or Rainbows!. Both use EventMachine under the hood and provide Rack interface support.
# config.ru
http_server = proc do |env|
  response = "Hello World!"
  [200, {"Connection" => "close", "Content-Length" => response.bytesize.to_s}, [response]]
end

run http_server
And then
>> thin start -R config.ru
Update:
If you need the server to run in parallel, you could run it in a Thread:
require 'thin'

class ThreadedServer
  def initialize(*args)
    @server = Thin::Server.new(*args)
  end

  def start
    @thread = Thread.start do
      @server.start
    end
  end

  def stop
    @server.stop
    if @thread
      @thread.join
      @thread = nil
    end
  end
end

http_server = proc do |env|
  response = "Hello World!"
  [200, {"Connection" => "close", "Content-Length" => response.bytesize.to_s}, [response]]
end

server = ThreadedServer.new http_server
server.start

# Some job with server

server.stop

# Server is down

em-http-request unexpected result when using tor as proxy

I've created a gist which shows exactly what happens.
https://gist.github.com/4418148
I've tested a version which used Ruby's 'net/http' library together with 'socksify/http' and it worked perfectly, but the EventMachine version returns an unexpected result.
The response in the Tor Browser is correct, but the one from EventMachine is not!
It returns a response, but it's not the same as the one you get when you send the request via a browser or via net/http, with or without the proxy.
For convenience, I will also paste it here.
require 'em-http-request'

DEL = '-' * 40
@results = 0

def run_with_proxy
  connection_opts = {:proxy => {:host => '127.0.0.1', :port => 9050, :type => :socks5}}
  conn = EM::HttpRequest.new("http://www.apolista.de/tegernsee/kloster-apotheke", connection_opts)
  http = conn.get
  http.callback {
    if http.response.include? "Oops"
      puts "#{DEL}failed with proxy#{DEL}", http.response
    else
      puts "#{DEL}success with proxy#{DEL}", http.response
    end
    @results += 1
    EM.stop_event_loop if @results == 2
  }
end

def run_without_proxy
  conn = EM::HttpRequest.new("http://www.apolista.de/tegernsee/kloster-apotheke")
  http = conn.get
  http.callback {
    if http.response.include? "Oops"
      puts "#{DEL}failed without proxy#{DEL}", http.response
    else
      puts "#{DEL}success without proxy#{DEL}", http.response
    end
    @results += 1
    EM.stop_event_loop if @results == 2
  }
end

EM.run do
  run_with_proxy
  run_without_proxy
end
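For comparison, the net/http + socksify/http variant mentioned above (the one that returned the expected page) would look roughly like this sketch, assuming the socksify gem's Net::HTTP.SOCKSProxy helper:
require 'net/http'
require 'socksify/http'

uri = URI('http://www.apolista.de/tegernsee/kloster-apotheke')

# route the plain net/http request through the local Tor SOCKS proxy
Net::HTTP.SOCKSProxy('127.0.0.1', 9050).start(uri.host, uri.port) do |http|
  response = http.get(uri.request_uri)
  if response.body.include? "Oops"
    puts "failed with socks proxy"
  else
    puts "success with socks proxy"
  end
end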
Appreciate any clarification.

Streaming on Ruby Thin server

I tried the following ruby code...
self.response.headers["Cache-Control"] ||= "no-cache"
self.response.headers["Transfer-Encoding"] = "chunked"
self.response.headers['Last-Modified'] = Time.now.ctime.to_s

self.response_body = Rack::Chunked::Body.new(Enumerator.new do |y|
  10.times do
    sleep 1
    y << "Hello World\n"
  end
end)
This works great on the Unicorn server but won't stream on Thin. I tried 1.5.0 and 2.0.0.pre too; it's not working in Thin.
I also tried the following Rack code:
class DeferredBody
  def each(&block)
    @server_block = block
  end

  def send(data)
    @server_block.call data
  end
end

class RackStreamApp
  def self.call(env)
    Thread.new do
      sleep 2 # simulate waiting for some event
      body = DeferredBody.new
      response = [200, {'Content-Type' => 'text/plain'}, body]
      env['async.callback'].call response
      body.send 'Hello, '
      sleep 2
      body.send 'World'
    end
    [-1, {}, []] # or throw :async
  end
end
The above code streams "Hello, World" if we use Unicorn Server, but the code doesn't stream using Thin server 1.5.0 ( I tried 2.0.0-pre too)
Is there anything I can do to stream data using the thin server?
