Why does an HTTP request fail when run from a rake task in a thread? - ruby

I am trying to figure out why the following code does not work. The response is never printed, and from other debugging it appears the request itself fails.
task :test_me do
  t1 = Thread.new do
    puts 'start'
    uri = URI.parse("http://google.com/")
    response = Net::HTTP.get_response(uri)
    puts response.inspect # this line not getting printed
  end
  # puts t1.value
end
However, if I run the following:
task :test_me do
  t1 = Thread.new do
    puts 'start'
    uri = URI.parse("http://google.com/")
    response = Net::HTTP.get_response(uri)
    puts response.inspect # this line is printing because of puts below
  end
  puts t1.value
end
All is well
Note: there are probably many ways to restructure this code, but I have stripped the example down as far as possible, and it's extracted from a gem, so I don't have much control over it.
If I can get a solid reason why this is not working from a rake task, I could potentially go back to them with a PR.
Thanks.

This happens because you never call join on the thread: the rake task, and with it the whole process, exits before the thread has finished the HTTP request, so the thread is killed. When you use .value, it joins the thread for you automatically (as documented here).
Try this:
task :test_me do
  t1 = Thread.new do
    puts 'start'
    uri = URI.parse("http://google.com/")
    response = Net::HTTP.get_response(uri)
    puts response.inspect
  end
  t1.join
end
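If the caller actually needs the response, a minimal sketch of the same task (assuming a standalone Rakefile, hence the explicit require) can lean on Thread#value, which joins the thread and then returns the block's last expression:
require 'net/http' # net/http also loads uri for URI.parse

task :test_me do
  t1 = Thread.new do
    uri = URI.parse("http://google.com/")
    Net::HTTP.get_response(uri) # last expression becomes the thread's value
  end
  response = t1.value # joins the thread, then hands back its result
  puts response.code
end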

Related

Grabbing JSON data from API with multi-threaded requests

I'm using httparty for making requests and currently have the following code:
def scr(users)
  users.times do |id|
    test_url = "siteurl/#{id}"
    Thread.new do
      response = HTTParty.get(test_url)
      open('users.json', 'a') do |f|
        f.puts "#{response.to_json}, "
      end
      p "added"
    end
  end
  sleep
end
It works OK for 100-300 records.
I tried adding Thread.exit after sleep, but if I set users to something like 200000, after a while my terminal throws an error. I don't remember exactly what it was, but it was something about threads and a resource being busy; only some of the records (about 10000) were added successfully.
It looks like I'm doing it wrong and need to somehow break the requests into batches.
Update: here's what I got:
def scr(users)
  threads = []
  urls = []
  users.times do |id|
    test_url = "site_url/#{id}"
    urls << test_url
  end
  # note: nothing waits between slices, so every thread is still created up front
  urls.each_slice(8) do |batch|
    batch.each do |t|
      threads << Thread.new do
        response = HTTParty.get(t)
        response.to_json
      end
    end
  end
  all_values = threads.map { |t| t.value }.join(', ')
  open('users.json', 'a') do |f|
    f.puts all_values
  end
end
On quick inspection, the problem would seem to be that you have a race condition with regards to your JSON file. Even if you don't get an error, you'll definitely get corrupted data.
The simplest solution is probably just to do all the writing at the end:
def scr(users)
  threads = []
  users.times do |id|
    test_url = "siteurl/#{id}"
    threads << Thread.new do
      response = HTTParty.get(test_url)
      response.to_json
    end
  end
  all_values = threads.map { |t| t.value }.join(', ')
  open('users.json', 'a') do |f|
    f.puts all_values
  end
end
Wasn't able to test that, but it should do the trick. It's also generally better to use Thread#join or Thread#value instead of sleep.
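That still creates one thread per user, though, which is what blew up around 200000. A sketch of the batching idea from the update (the batch size of 8 is arbitrary): fetch one slice of IDs at a time, wait on those threads via Thread#value, and append the results before starting the next slice.
require 'httparty'

def scr(users, batch_size = 8)
  open('users.json', 'a') do |f|
    (0...users).each_slice(batch_size) do |batch|
      threads = batch.map do |id|
        Thread.new { HTTParty.get("siteurl/#{id}").to_json }
      end
      # Thread#value joins each thread, so at most batch_size requests run at once
      f.puts threads.map(&:value).join(', ')
    end
  end
end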

Running Sinatra server and subprocess asynchronously

I am trying to run a process that parses flight-tracking data and continuously turns it into JSON strings (a looping process) alongside a Sinatra server that responds to GET requests with those JSON strings. I tried using threads to handle this but have had no success. How can I run these two side by side? Here are some more specifics:
I have a class Aircraft with an array of Aircraft objects called Aircraft::All. I have a method that continually updates this array that I want to run alongside a Sinatra server that responds to GET requests with the list of aircraft in JSON format.
Here is the code:
# starting the data stream from external process
IO.popen("./dump1090") do |data|
  block = ""
  # created sinatra server thread
  t1 = Thread.new do
    set :port, 8080
    set :environment, :production
    get '/aircrafts' do
      return_message = {}
      if !Aircraft::All.first.nil?
        return_message[:status] = 'success'
        return_message[:aircrafts] = message_maker
      else
        return_message[:status] = 'sorry - something went wrong'
        return_message[:aircrafts] = []
      end
      return_message.to_json
    end
  end
  # parsing the data in main thread -- the process
  # I want to run alongside the server (parse_block updates Aircraft::All)
  while line = data.gets
    if line.to_s.split('').first == '*'
      parse_block(block)
      puts block
      Aircraft::All.reject! { |aircraft| Time.now.to_f - aircraft.contact_time > 30 }
      block = ""
    end
    block += line.to_s
  end
end
Here the main thread runs the Sinatra app and an additional thread loads the data, which is the more usual arrangement:
class Aircraft
  @aircrafts = {}

  def self.all
    @aircrafts
  end
end

Thread.new do
  no = 1
  while true
    Aircraft.all[no] = 'Boing'
    no += 1
    sleep(3)
  end
end

get '/aircrafts' do
  Aircraft.all.to_json
end
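Applied to the question's own code, the same inversion looks roughly like this (untested; ./dump1090, parse_block, and Aircraft::All are the OP's): the parser moves into a background thread and classic Sinatra keeps the main thread.
require 'sinatra'
require 'json'

set :port, 8080

# background thread: read and parse the dump1090 stream
Thread.new do
  IO.popen("./dump1090") do |data|
    block = ""
    while line = data.gets
      if line.to_s.split('').first == '*'
        parse_block(block) # updates Aircraft::All
        Aircraft::All.reject! { |aircraft| Time.now.to_f - aircraft.contact_time > 30 }
        block = ""
      end
      block += line.to_s
    end
  end
end

# routes defined in the main thread; Sinatra's built-in server starts when the file ends
get '/aircrafts' do
  Aircraft::All.to_json
end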

How can I use EventMachine from within a Sinatra app?

I use an API that is written on top of EM. This means that to make a call, I need to write something like the following:
EventMachine.run do
  api.query do |result|
    # Do stuff with result
  end
  EventMachine.stop
end
Works fine.
But now I want to use this same API within a Sinatra controller. I tried this:
get "/foo" do
output = ""
EventMachine.run do
api.query do |result|
output = "Result: #{result}"
end
EventMachine.stop
end
output
end
But this doesn't work. The run block is bypassed, so an empty response is returned and once stop is called, Sinatra shuts down.
Not sure if it's relevant, but my Sinatra app runs on Thin.
What am I doing wrong?
I've found a workaround: busy-wait until the data becomes available. Since Thin already runs inside an EventMachine reactor, EventMachine.run does not block here; it simply invokes its block and returns before the asynchronous query has finished, and EventMachine.stop tears down the reactor that Thin itself is running on. Possibly not the best solution, but it works at least:
helpers do
  def wait_for(&block)
    while (return_val = block.call).nil?
      sleep(0.1)
    end
    return_val
  end
end
get "/foo" do
output = nil
EventMachine.run do
api.query do |result|
output = "Result: #{result}"
end
end
wait_for { output }
end
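A variation on the same idea, sketched here untested, is to block on a Queue instead of polling every 100 ms; like the busy wait, it assumes api.query delivers its result outside the thread that is handling this request:
require 'thread'

get "/foo" do
  results = Queue.new
  EventMachine.run do
    api.query do |result|
      results << "Result: #{result}" # hand the value to the waiting request thread
    end
  end
  results.pop # blocks until the callback pushes a value
end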

EventMachine with em-synchrony: I need to correctly throttle my HTTP requests

I have a consumer which pulls messages off a queue via an evented subscription. It takes those messages and then connects to a rather slow HTTP interface. I have a worker pool of 8, and once those are all filled up I need to stop pulling requests from the queue and let the fibers that are working on the HTTP jobs keep working. Here is an example I've thrown together:
def send_request(callback)
  EM.synchrony do
    while $available <= 0
      sleep 2
      puts "sleeping"
    end
    url = 'http://example.com/api/Restaurant/11111/images/?image%5Bremote_url%5D=https%3A%2F%2Firs2.4sqi.net%2Fimg%2Fgeneral%2Foriginal%2F8NMM4yhwsLfxF-wgW0GA8IJRJO8pY4qbmCXuOPEsUTU.jpg&image%5Bsource_type_enum%5D=3'
    result = EM::Synchrony.sync EventMachine::HttpRequest.new(url, :inactivity_timeout => 0).send("apost", :head => {:Accept => 'services.v1'})
    callback.call(result.response)
  end
end

def display(value)
  $available += 1
  puts value.inspect
end

$available = 8

EM.run do
  EM.add_periodic_timer(0.001) do
    $available -= 1
    puts "Available: #{$available}"
    puts "Tick ..."
    puts send_request(method(:display))
  end
end
I have found that if I call sleep within a while loop in the synchrony block, the reactor loop gets stuck. If I call sleep within an if statement (sleeping just once), then most of the time that gives the requests enough time to finish, but it is unreliable at best. If I use EM::Synchrony.sleep, then the main reactor loop just keeps creating new requests.
Is there a way to pause the main loop but have the fibers finish their execution?
sleep 2
...
add_periodic_timer(0.001)
Are you serious? Have you ever thought about how many send_requests are sleeping in that loop? The timer adds roughly 1000 more every second.
What about this instead: a fixed pool of eight fibers, where each fiber yields while its request is in flight and is resumed from the request's callback, so at most eight requests are ever outstanding:
require 'eventmachine'
require 'em-http'
require 'fiber'

class Worker
  URL = 'http://example.com/api/whatever'

  def initialize(callback)
    @callback = callback
  end

  def work
    f = Fiber.current
    loop do
      http = EventMachine::HttpRequest.new(URL).get :timeout => 20
      http.callback do
        @callback.call http.response
        f.resume
      end
      http.errback do
        f.resume
      end
      Fiber.yield
    end
  end
end

def display(value)
  puts "Done: #{value.size}"
end

EventMachine.run do
  8.times do
    Fiber.new do
      Worker.new(method(:display)).work
    end.resume
  end
end
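A hypothetical extension of that pattern for the queue-fed case in the question (the fiber_pop helper and the jobs queue below are mine, not part of the answer): each worker pops its next job from an EM::Queue, so the pool still never has more than eight requests in flight no matter how fast the subscription pushes work.
require 'eventmachine'
require 'em-http'
require 'fiber'

# Park the current fiber until the queue yields an item.
def fiber_pop(queue)
  f = Fiber.current
  queue.pop { |item| EM.next_tick { f.resume(item) } }
  Fiber.yield
end

EventMachine.run do
  jobs = EM::Queue.new

  8.times do
    Fiber.new do
      f = Fiber.current
      loop do
        url = fiber_pop(jobs)
        http = EventMachine::HttpRequest.new(url).get :timeout => 20
        http.callback { puts "Done: #{http.response.size}"; f.resume }
        http.errback { f.resume }
        Fiber.yield # park until the callback or errback fires
      end
    end.resume
  end

  # the evented subscription would feed work with jobs.push(url)
end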

Thread lockup in ruby with Soap4r

This is related to a question I asked here:
Thread Locking in Ruby (use of soap4r and QT)
However, it concerns one particular part of that question and is illustrated by a simpler example. The test code is:
require 'rubygems'
require 'thread'
require 'soap/rpc/standaloneserver'

class SOAPServer < SOAP::RPC::StandaloneServer
  def initialize(*args)
    super
    # Exposed methods
    add_method(self, 'test', 'x', 'y')
  end

  def test(x, y)
    return x + y
  end
end

myServer = SOAPServer.new('monitorservice', 'urn:ruby:MonitorService', 'localhost', 4004)

Thread.new do
  puts 'Starting web services'
  myServer.start
  puts 'Ending web services'
end

sleep(4)

#Thread.new do
  testnum = 0
  while testnum < 4000 do
    testnum += 1
    puts myServer.test(0, testnum)
    sleep(2)
  end
#end

puts myServer.test(0, 4001)
puts myServer.test(0, 4002)
puts myServer.test(0, 4003)
puts myServer.test(0, 4004)
gets
When I run this with the thread commented out, everything runs along fine. However, once the thread is put in, the process hangs. I poked into WEBrick and found that it stops here (the puts calls are, of course, mine):
while @status == :Running
  begin
    puts "1.1"
    if svrs = IO.select(@listeners, nil, nil, 2.0)
      svrs[0].each{|svr|
        puts "-+-"
        @tokens.pop # blocks while no token is there.
        if sock = accept_client(svr)
          th = start_thread(sock, &block)
          th[:WEBrickThread] = true
          thgroup.add(th)
        else
          @tokens.push(nil)
        end
      }
    end
    puts ".+."
When run with the thread NOT commented out I get something like this:
Starting web services
1.1
.+.
1.1
4001
4002
4003
4004
1
.+.
1.1
If the problem is caused by the gets call, and the purpose of gets in your code is to keep the Ruby interpreter from exiting, you can replace it with a join on each thread that you create. join blocks until that thread has finished executing, so it also keeps the Ruby interpreter from exiting.
E.g.:
t1 = Thread.new do
  puts 'Starting web services'
  myServer.start
  puts 'Ending web services'
end

t2 = ...
...

t1.join
t2.join
Alternatively, you can join only one of the threads if there is a single thread that controls the execution of the application; the other threads will be killed on exit.
The trailing gets blocks Ruby's IO. I'm not sure why. If it is replaced with pretty much anything else, the program works. I used a sleep loop:
loop do
  sleep 1
end
ADDED: I should note that I also get strange behavior with sleep depending on the sleep increment. In the end I abandoned Ruby, since the threading behavior was too wonky.
