Using EventMachine HTTP request in a Sidekiq worker - ruby

So let's say I have a Sidekiq process that sends off an HTTP POST request whose response I don't want to wait for. I don't want this to block and slow down the workers.
One idea I have is to use this simple sample code for EventMachine::HttpRequest:
EventMachine.run do
  http = EventMachine::HttpRequest.new("http://www.example.com").post :options => {...}
  http.callback do
    puts "got a response"
    puts http.response
    EventMachine.stop
  end
  puts "worker finished"
end
So let's assume my worker process finishes before the callback is called. What will happen here? Does this mean the pointer to the callback will fail? I'd like to understand the flow of control here.

Depending on what you need:
You want to utilize CPU
Sidekiq workers are very lightweight. You can run more of them to utilize the CPU while waiting for the response.
You want workers to finish faster.
You can enqueue each request to be processed by a different worker, as sketched below. It will be like next_tick() in EM.
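A minimal sketch of that fan-out (the worker names and the plain Net::HTTP call are illustrative, not from the question):

require 'sidekiq'
require 'net/http'

# Parent job fans out: one lightweight job per HTTP request.
class FanOutWorker
  include Sidekiq::Worker

  def perform(urls)
    urls.each { |url| PostWorker.perform_async(url) }
  end
end

# Each job blocks on its own request, so no single worker is slowed down.
class PostWorker
  include Sidekiq::Worker

  def perform(url)
    Net::HTTP.post_form(URI(url), {})
  end
end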
I'm excited about Sidekiq and Celluloid because they change the way we think. http://www.slideshare.net/KyleDrake/hybrid-concurrency-patterns

The EventMachine.run block will not return until you call EventMachine.stop. So, in your case, the worker won't finish without the callback being run.
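Concretely, wrapped in a worker the flow would look like this (a minimal sketch; the worker class and request options are made up):

require 'sidekiq'
require 'em-http'

class NotifyWorker
  include Sidekiq::Worker

  def perform(payload)
    EventMachine.run do
      http = EventMachine::HttpRequest.new("http://www.example.com").post :body => payload
      http.callback do
        puts "got a response"
        EventMachine.stop    # only now does EventMachine.run return
      end
      http.errback { EventMachine.stop } # don't let a failed request hang the worker
    end
    puts "worker finished"   # reached strictly after the callback (or errback) has fired
  end
end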

Related

Call EventMachine defer within callback?

I'm using EventMachine.defer to handle some long-running processes (an indefinite wait for a response from an outside application). I want to do this in a loop: each time the application responds, I process the response and then immediately want to start waiting for the next response.
My code currently looks like this:
def watch_for_songs_change
  EM.defer(
    ->( ){ `mpc idle playlist` }, # wait for the song list to change
    ->(_){ update_songs; watch_for_songs_change }
  )
end
I realized that this is calling defer from within a callback from defer. Is this valid? Am I spawning one thread from inside another, and will I eventually run out of threads? Or does EventMachine invoke the callback after it has returned the thread to the pool?
I've tried to chain calls like this before in EM, and found that using periodic timers is usually a better design. (For what it's worth, EM.defer runs the operation on a pool thread but invokes the callback on the reactor thread, so the nested defer is not spawning one thread from inside another.)
@timer = EventMachine.add_periodic_timer(1) { `mpc idle playlist` and update_songs }
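Keeping the returned handle in @timer also means the polling can be stopped later with EventMachine.cancel_timer(@timer).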

Which of Ruby's concurrency devices would be best suited for this scenario?

The whole threads/fibers/processes thing is confusing me a little. I have a practical problem that can be solved with some concurrency, so I thought this was a good opportunity to ask professionals and people more knowledgeable than me about it.
I have a long array, let's say 3,000 items. I want to send an HTTP request for each item in the array.
Actually iterating over the array, generating requests, and sending them is very rapid. What takes time is waiting for each item to be received, processed, and acknowledged by the party I'm sending to. I'm essentially sending 100 bytes, waiting 2 seconds, sending 100 bytes, waiting 2 seconds.
What I would like to do instead is send these requests asynchronously. I want to send a request, specify what to do when I get the response, and in the meantime, send the next request.
From what I can see, there are four concurrency options I could use here.
1. Threads.
2. Fibers.
3. Processes; unsuitable as far as I know because multiple processes accessing the same array isn't feasible/safe.
4. Asynchronous functionality like JavaScript's XMLHttpRequest.
The simplest would seem to be the last one. But what is the best, simplest way to do that using Ruby?
Failing #4, which of the remaining three is the most sensible choice here?
Would any of these options also allow me to say "Have no more than 10 pending requests at any time"?
This is your classic producer/consumer problem and is nicely suited to threads in Ruby. Just create a Queue (a SizedQueue here, which also answers the "no more than 10 pending requests" requirement, since pushes block while the queue is full):
require "thread"

urls = [...] # array with bunches of urls
queue = SizedQueue.new(10) # this will only allow 10 items on the queue at once

producer = Thread.new do
  urls.each do |url|
    response = do_http_request(url)
    queue << response
  end
  queue << "done" # sentinel so the consumer knows when to stop
end

consumer = Thread.new do
  loop do
    http_response = queue.pop # blocks while the queue is empty
    break if http_response == "done"
    process(http_response)
  end
end

# wait for the consumer to finish
consumer.join
EventMachine as an event loop and em-synchrony as a Fiber wrapper that turns its callbacks into synchronous code.
Copy-paste from the em-synchrony README:
require "em-synchrony"
require "em-synchrony/em-http"
require "em-synchrony/fiber_iterator"
EM.synchrony do
concurrency = 2
urls = ['http://url.1.com', 'http://url2.com']
results = []
EM::Synchrony::FiberIterator.new(urls, concurrency).each do |url|
resp = EventMachine::HttpRequest.new(url).get
results.push resp.response
end
p results # all completed requests
EventMachine.stop
end
This is an IO-bound case that fits both:
The threading model: no problem with MRI Ruby in this case, because threads work well for IO-bound cases; the GIL's effect is almost zero.
The asynchronous model, which proves (in practice and in theory) to be far superior to threads when it comes to IO-specific problems.
For this specific case, and to make things far simpler, I would have gone with the Typhoeus HTTP client, which has parallel support that works on the evented (asynchronous) concurrency model.
Example:
hydra = Typhoeus::Hydra.new

%w(url1 url2 url3).each do |url|
  request = Typhoeus::Request.new(url, followlocation: true)
  request.on_complete do |response|
    # do something with response
  end
  hydra.queue(request)
end

hydra.run # this is a blocking call that returns once all requests are complete
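Hydra also covers the "no more than 10 pending requests at any time" part of the question; if I remember the Typhoeus API correctly, the cap is a constructor option:

hydra = Typhoeus::Hydra.new(max_concurrency: 10)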

Send multiple messages over a websocket using threads

I'm making a Ruby server using the em-websocket gem. When a client sends some message (e.g. "thread"), the server creates two different threads and sends two answers to the client in parallel (I'm actually studying multithreading and websockets). Here's my code:
require 'em-websocket'

EM.run {
  EM::WebSocket.run(:host => "0.0.0.0", :port => 8080) do |ws|
    ws.onmessage { |msg|
      puts "Received message: #{msg}"
      if msg == "thread"
        threads = []
        threads << Thread.new {
          sleep(1)
          puts "1"
          ws.send("Message sent from thread 1")
        }
        threads << Thread.new {
          sleep(2)
          puts "2"
          ws.send("Message sent from thread 2")
        }
        threads.each { |aThread| aThread.join }
      end
    }
  end
}
How it executes:
I'm sending "thread" message to a server
After one second in my console I see printed string "1". After another second I see "2".
Only after that both messages simultaneously are sent to the client.
The problem is that I want to send messages exactly at the same time when debug output "1" and "2" are sent.
My Ruby version is 1.9.3p194.
I don't have experience with EM, so take this with a pinch of salt.
However, at first glance, it looks like "aThread.join" is blocking the "onmessage" method from completing and thus also preventing the "ws.send" calls from being processed.
Have you tried removing the "threads.each" block?
Edit:
After having tested this on Arch Linux with both Ruby 1.9.3 and 2.0.0 (using "test.html" from the em-websocket examples), I am sure that even if removing the "threads.each" block doesn't fix the problem for you, you will still have to remove it, as Thread#join suspends the current thread until the joined threads are finished.
If you follow the function call of "ws.onmessage" through the source code, you will end up at the Connection#send_data method of the EventMachine module, where the comments say:
Call this method to send data to the remote end of the network connection. It takes a single String argument, which may contain binary data. Data is buffered to be sent at the end of this event loop tick (cycle).
As "onmessage" is blocked by the "join" until both "send" methods have run, the event loop tick cannot finish until both sets of data are buffered and thus, all the data cannot be sent until this time.
If it is still not working for you after removing the "threads.each" block, make sure that you have restarted your EventMachine, and try setting the second sleep to 5 seconds instead. I don't know how long a typical event loop tick takes in EventMachine (and I can't imagine it being as long as a second), but the documentation basically says that if several "send" calls are made within the same tick, they will all be sent at the same time. Increasing the time difference will make sure that this is not what is happening. A sketch of the handler with the join removed follows.
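Putting that together, a sketch of the handler with the join removed (untested; otherwise the question's code, as described above):

ws.onmessage { |msg|
  puts "Received message: #{msg}"
  if msg == "thread"
    Thread.new {
      sleep(1)
      puts "1"
      ws.send("Message sent from thread 1") # buffered and flushed at the end of the tick in which it runs
    }
    Thread.new {
      sleep(2)
      puts "2"
      ws.send("Message sent from thread 2")
    }
    # no threads.each/join: the handler returns immediately,
    # so each send can go out on its own event loop tick
  end
}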
I think the problem is that you are calling the sleep method, passing 1 to the first thread and 2 to the second thread.
Try removing the sleep call in both threads, or passing the same value to each call.

Asynchronous HTTP requests with ruby

I have a RabbitMQ queue full of requests, and I want to send each request as an HTTP GET asynchronously, without waiting for its response. Now I'm confused about what is better to use: threads, or just EM? The way I'm using it at the moment is something like the following, but it would be great to know whether there is a better implementation with better performance, since this is a very crucial part of the program:
AMQP.start(:host => "localhost") do |connection|
  queue = MQ.queue("some_queue")
  queue.subscribe do |body|
    EventMachine::HttpRequest.new('http://localhost:9292/faye').post :body => {:message => body.to_json}
  end
end
With the code above, will the system wait for each request to finish before starting the next one? Any tips here would be highly appreciated.
HTTP is synchronous, so you have to wait for the replies. If you want to simulate an async environment, you could have a thread pool and pass each request to a thread, which waits for the reply and then goes back into the pool until the next request. You would either send the thread a callback function to use when the reply is finished, or immediately get back a future reply object, which lets you put off waiting for the reply until you actually need the reply data.
The other way is to have a pool of processes, each of which is processing a request, waiting for the reply, and so on.
In both cases, you have to have a pool that is big enough, or else you will still end up waiting some of the time. A rough sketch of the thread-pool variant follows.
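A minimal sketch of that thread-pool idea (the pool size, plain Net::HTTP, and the nil shutdown sentinel are my choices, not the answer's):

require 'thread'
require 'net/http'

POOL_SIZE = 10
jobs = Queue.new

pool = Array.new(POOL_SIZE) do
  Thread.new do
    while (url = jobs.pop)      # nil sentinel ends the loop
      Net::HTTP.get(URI(url))   # each pool thread waits on its own reply
    end
  end
end

# producer side: push requests as they arrive from the broker
jobs << 'http://localhost:9292/faye'

POOL_SIZE.times { jobs << nil } # shut the pool down
pool.each(&:join)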

EventMachine, Redis & EM HTTP Request

I'm trying to read URLs from a Redis store and simply fetch the HTTP status of each URL, all within EventMachine. I don't know what's wrong with my code, but it's not asynchronous as expected.
All requests are fired from the first one to the last one, and curiously I only get the first response (the HTTP header I want to check) after the last request. Does anyone have a hint about what's going wrong here?
require 'eventmachine'
require 'em-hiredis'
require 'em-http'

EM.run do
  @redis = EM::Hiredis.connect
  @redis.errback do |code|
    puts "Error code: #{code}"
  end

  @redis.keys("domain:*") do |domains|
    domains.each do |domain|
      if domain
        http = EM::HttpRequest.new("http://www.#{domain}", :connect_timeout => 1).get
        http.callback do
          puts http.response_header.http_status
        end
      else
        EM.stop
      end
    end
  end
end
I'm running this script for a few thousand domains so I would expect to get the first responses before sending the last request.
While EventMachine is async, the reactor itself is single-threaded. So, while your loop is running and firing off those thousands of requests, none of them is being executed until the loop exits. Then, if you call EM.stop, you'll stop the reactor before they execute.
You can use something like EM::Iterator to break the processing of domains into chunks that let the reactor execute. Then you'll need to do some magic if you really want to call EM.stop: keep a counter of dispatched requests and received responses, and stop the reactor only once they match. A rough sketch follows.
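A rough sketch of that idea, assuming the domains array has already been fetched from Redis (EM::Iterator takes a list and a concurrency level, and each block must call iter.next to release its slot):

EM.run do
  pending = domains.size # counter of outstanding requests

  EM::Iterator.new(domains, 50).each do |domain, iter|
    http = EM::HttpRequest.new("http://www.#{domain}", :connect_timeout => 1).get
    http.callback do
      puts http.response_header.http_status
      pending -= 1
      EM.stop if pending.zero? # stop only after the last response
      iter.next
    end
    http.errback do
      pending -= 1
      EM.stop if pending.zero?
      iter.next
    end
  end
end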
