Ruby Async Gem: What is a basic usage example?

With Ruby 3.0, the async gem is now compatible with blocking IO in standard library functions. I wanted to understand the basic functionality, but I am already confused by a simple example:
require 'async'
require 'httparty'

n = 10

n.times do |i|
  Async do
    HTTParty.get("https://httpbin.org/delay/1.6")
  end
end
This doesn't show any parallelism. The gem's documentation for Kernel#Async says:
Run the given block of code in a task, asynchronously, creating a reactor if necessary.
But the project documentation seems to clear it up:
When invoked at the top level, will create and run a reactor, and invoke the block as an asynchronous task. Will block until the reactor finishes running.
So to make the example from above work:
require 'async'
require 'httparty'

n = 10

Async do
  n.times do |i|
    Async do
      HTTParty.get("https://httpbin.org/delay/1.6")
    end
  end
end
This works, but it seems confusing to the reader. How would we know, as readers, that the outer Async do block is blocking while the inner ones are not?
Thus the question: What is the canonical basic usage of the async gem?
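For reference, here is a sketch of the pattern that reads less ambiguously to me (it assumes, as the gem's documentation describes, that the top-level Async block yields the parent task and that tasks expose #wait):

require 'async'
require 'httparty'

Async do |parent|
  # Spawn the requests as child tasks of the (blocking) top-level task...
  tasks = 10.times.map do
    parent.async do
      HTTParty.get("https://httpbin.org/delay/1.6")
    end
  end

  # ...then wait for them explicitly instead of relying on the implicit
  # "top-level task blocks until its children finish" behavior.
  responses = tasks.map(&:wait)
  puts responses.map(&:code).inspect
end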
Further reading:
Async Gem Project documentation: Getting started
Blog Post about Async Ruby

Related

How to detect break during yield

For a gem I intend to publish, I want to create an enumerable interface wrapping an external library (called via FFI).
I have this code (stripped for clarity):
def each_shape(&block)
  callback = lambda do |*args|
    yield
  end
  cpSpaceEachShape(callback) # FFI function, calls callback
end
which is called with a block like this:
space.each_shape do
  # ...
end

cpSpaceStep() # Other FFI function
cpSpaceEachShape is an external C library function, which in turn calls callback a number of times (synchronously, with no way to cancel it).
This works great, until I use break in the top-level block, like this:
space.each_shape do
  break
end

cpSpaceStep() # Error from C library because iteration is still running
I'm not sure how Ruby, FFI and the C library are interacting here. From the Ruby perspective it looks like the break "jumps" out of the top-level block, out of the callback block, out of cpSpaceEachShape and out of each_shape.
Side question: What happens here from a low-level perspective, e.g. what happens to the stack?
Main question: Can I capture the break from the top-level block?
I was hoping for something like this: (not working, pseudo code)
def each_shape(&block)
  @each_shape_cancelled = false
  callback = lambda do |*args|
    yield unless @each_shape_cancelled
  rescue StopIteration
    @each_shape_cancelled = true
  end
  cpSpaceEachShape(callback)
end
EDIT: This is for a gem I intend to publish. I want my users to be able to use regular Ruby. If they were required to use throw... how would they know...? If there is no good solution I will, begrudgingly, construct an array beforehand, collecting/caching everything from the callbacks before yielding its contents.
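A minimal sketch of that fallback, assuming the hypothetical cpSpaceEachShape binding passes each shape to the callback: let the C iteration run to completion into a plain array first, then yield from ordinary Ruby, so a break in the caller's block never unwinds through the C callback stack.

def each_shape
  shapes = []

  # The C iteration always runs to completion; nothing here can be
  # interrupted by the caller's block.
  collector = lambda do |shape, *|
    shapes << shape
  end
  cpSpaceEachShape(collector) # hypothetical FFI binding from the question

  # Yield outside the C callback; `break` here unwinds only Ruby frames.
  shapes.each { |shape| yield shape }
end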

What is the use case for future.add_done_callback()?

I understand how to add a callback method to a future and have it called when the future is done. But why is this helpful when you can already call functions from inside coroutines?
Callback version:
import asyncio

def bar(future):
    # do stuff using future.result()
    ...

async def foo(future):
    await asyncio.sleep(3)
    future.set_result(1)

loop = asyncio.get_event_loop()
future = loop.create_future()
future.add_done_callback(bar)
loop.run_until_complete(foo(future))
Alternative:
async def foo():
    await asyncio.sleep(3)
    bar(1)

loop = asyncio.get_event_loop()
loop.run_until_complete(foo())
When would the second version not be available/suitable?
In the code as shown, there is no reason to use an explicit future and add_done_callback; you could always await. A more realistic use case is if the situation were reversed: if bar() spawned foo() and needed access to its result:
def bar():
    fut = asyncio.create_task(foo())
    def when_finished(_fut):
        print("foo returned", fut.result())
    fut.add_done_callback(when_finished)
If this reminds you of "callback hell", you are on the right track: Future.add_done_callback is a rough equivalent of the then operator of pre-async/await JavaScript promises. (Details differ because then() is a combinator that returns another promise, but the basic idea is the same.)
A large part of asyncio is implemented in this style, using non-async functions that orchestrate async futures. That basic layer of transports and protocols feels like a modernized version of Twisted, with the coroutines and streams implemented as a separate layer on top of it, as higher-level sugar. Application code written against that basic toolset subclasses protocol classes and wires up explicit callbacks instead of awaiting coroutines.
Even when working with non-coroutine callbacks, there is rarely a good reason to use add_done_callback, other than inertia or copy-paste. For example, the above function could be trivially transformed to use await:
def bar():
    async def coro():
        ret = await foo()
        print("foo returned", ret)
    asyncio.create_task(coro())
This is more readable than the original, and much, much easier to adapt to more complex awaiting scenarios. It is similarly easy to plug coroutines into the lower-level asyncio plumbing.
So, what then are the use cases when one needs to use the Future API and add_done_callback? I can think of several:
Writing new combinators.
Connecting coroutine code with code written in the more traditional callback style.
Writing Python/C code where async def is not readily available.
To illustrate the first point, consider how you would implement a function like asyncio.gather(). It must allow the passed coroutines/futures to run and wait until all of them have finished. Here add_done_callback is a very convenient tool, allowing the function to request notification from all the futures without awaiting them in series. In its most basic form that ignores exception handling and various features, gather() could look like this:
async def gather(*awaitables):
    loop = asyncio.get_event_loop()
    futs = list(map(asyncio.ensure_future, awaitables))
    remaining = len(futs)
    finished = loop.create_future()

    def fut_done(fut):
        nonlocal remaining
        remaining -= 1
        if not remaining:
            finished.set_result(None)  # wake up

    for fut in futs:
        fut.add_done_callback(fut_done)

    await finished
    # all awaitables done, we can return the results
    return tuple(f.result() for f in futs)
Even if you never use add_done_callback, it's a good tool to understand and know about for that rare situation where you actually need it.

Ruby blank environment scope for execution

I'm currently developing a framework that basically executes another application, e.g. a Rails app, within the context of another Ruby program. My initial attempt was simply to boot the app like this:
def load_app!
  # Load the rails application
  require './config/application'

  # Initialize the rails application
  @app = App::Application.initialize!
end
The problem here is that the framework's requires conflict with the loaded application, so the initialize! call never works, although it would in a normal Ruby program.
So my question is whether anyone knows a way to scope these calls into a unit that behaves like a blank RVM environment. So basically a behavior like this:
require 'json'

puts JSON.generate({:name => "test"})

blank_environment do
  puts JSON.generate({:name => "test"})
  #=> uninitialized constant JSON

  require 'json'
  puts JSON.generate({:name => "test"})
end
It can't be done by undefining or unloading the currently loaded constants, because I don't know all of them: I'm using gems that have further dependencies of their own.
So is there a cool way? Or any other way to handle this?
UPDATE:
I just came across an idea. Why does Ruby's require method always require into the global scope? Wouldn't it be a very nice feature to actually scope the loaded modules under the current module?
module ScopeA
  require 'json' #=> adds support for ScopeA::JSON

  # due to normal ruby scoping everything can be called like normal in here
  JSON.parse("something")
end

# JSON should not be available here

module ScopeB
  require 'yaml'
  YAML.parse("something") # but no JSON, of course
end
Doesn't something like this exist? include already has to know the constants...
Thanks in advance!
Well, after some more research it really doesn't seem possible the way I need it.
I have now implemented a basic version using distributed Ruby (DRb), which doesn't quite satisfy me:
require 'drb/drb'

URI = "druby://localhost:8787"

# Blank environment
pid = fork do
  Signal.trap("INT") { puts "Stopping Server.."; exit }

  class Application
    def call(env)
      [200, {}, ""]
    end
  end

  DRb.start_service(URI, Application.new)
  DRb.thread.join
end

# Working environment
DRb.start_service
app = DRbObject.new_with_uri(URI)

puts app.call({})

Process.kill("INT", pid)
Process.wait
If anyone comes up with a better approach, it's highly appreciated!
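As a lighter variation on the same fork-based isolation (a sketch, not from the original post; the helper name and payload are illustrative): run the conflicting code in a child process and ship a Marshal-able result back over a pipe, avoiding the DRb server entirely.

# Run a block in a forked child with its own, untouched set of requires,
# and return the block's (Marshal-able) result to the parent.
def in_blank_process
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    result = yield                      # requires done here stay in the child
    writer.write(Marshal.dump(result))
    writer.close
  end

  writer.close
  data = reader.read
  reader.close
  Process.wait(pid)

  Marshal.load(data)
end

# Example: JSON is loaded only inside the child process.
payload = in_blank_process do
  require 'json'
  JSON.generate({:name => "test"})
end
puts payload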

Fiber within EM::Connection (em-synchrony)

Can anybody explain to me why the Redis (redis-rb) synchrony driver works directly under an EM.synchrony block but doesn't within EM::Connection?
Consider the following example:
EM.synchrony do
  redis = Redis.new(:path => "/usr/local/var/redis.sock")
  id = redis.incr "local:id_counter"
  puts id

  EM.start_server('0.0.0.0', 9999) do |c|
    def c.receive_data(data)
      redis = Redis.new(:path => "/usr/local/var/redis.sock")
      puts redis.incr "local:id_counter"
    end
  end
end
I'm getting
can't yield from root fiber (FiberError)
when using it within receive_data. From reading the source code of both EventMachine and em-synchrony, I can't figure out what the difference is.
Thanks!
PS: The obvious workaround is to wrap the Redis code within EventMachine::Synchrony.next_tick, as hinted at in issue #59, but given the EM.synchrony block I would expect the call to already be wrapped within a Fiber...
PPS: The same applies to using EM::Synchrony::Iterator.
You're doing something rather tricky here. You're providing a block to start_server, which effectively creates an "anonymous" connection class and executes your block within the post_init method of that class. Then, within that class, you're defining an instance method.
The thing to keep in mind is: when the reactor executes a callback, or a method like receive_data, that happens on the main thread (and within the root fiber), which is why you're seeing this exception. To work around this, you need to wrap each such callback to be executed within a Fiber (for example, see the Synchrony.add_(periodic)_timer methods).
To address your actual exception: wrap the execution of receive_data within a Fiber. The outer EM.synchrony {} won't do anything for callbacks that are scheduled later by the reactor.
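A minimal sketch of that workaround, reusing the server from the question (it assumes the same redis-rb synchrony driver setup): resume a fresh Fiber inside receive_data so that the driver has a non-root fiber to yield from.

EM.synchrony do
  EM.start_server('0.0.0.0', 9999) do |c|
    def c.receive_data(data)
      # receive_data is invoked by the reactor on the root fiber, so wrap
      # the fiber-yielding Redis call in its own Fiber and resume it.
      Fiber.new do
        redis = Redis.new(:path => "/usr/local/var/redis.sock")
        puts redis.incr "local:id_counter"
      end.resume
    end
  end
end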

Need a ruby solution for executing a method in separate process

I am implementing a poller service whose interface looks like this.
poller = Poller.new(SomeClass)
poller.start
poller.stop
The start method is supposed to continuously hit an HTTP endpoint and update stuff in the database. Once started, the process is supposed to continue until it is explicitly stopped.
I understand that the implementation of start needs to spawn and run in a new process. I am not quite sure how to achieve that in Ruby. I want a plain Ruby solution instead of a framework-specific one (not Rails plugins or Sinatra extensions, just Ruby gems). I am exploring EventMachine and starling-workling. I find EventMachine too huge to understand in a short span, and workling is a plugin rather than a gem, so it is a pain to get it working in a plain Ruby application.
I need guidance on how to achieve this. Any pointers? Code samples will help.
Edit
An EventMachine or starling-workling solution would be preferred over threading/forking.
Can't you use the example from Process#kill:
class Poller
  def initialize(klass)
    @klass = klass
  end

  def start
    @pid = fork do
      Signal.trap("HUP") { puts "Ouch!"; exit }
      instance = @klass.new
      # ... do some work ...
    end
  end

  def stop
    Process.kill("HUP", @pid)
    Process.wait
  end
end
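To connect that back to the desired interface, here is a fleshed-out sketch of the forking approach with an actual polling loop in the child; the endpoint, interval, and the worker's update method are placeholders, not part of the original answer.

require 'net/http'

class Poller
  def initialize(klass)
    @klass = klass
  end

  def start
    @pid = fork do
      Signal.trap("HUP") { exit }   # the parent asks us to stop
      worker = @klass.new
      loop do
        # placeholder work: hit an endpoint and hand the body to the worker
        body = Net::HTTP.get(URI("https://example.com/status"))
        worker.update(body)         # hypothetical method on the wrapped class
        sleep 5                     # polling interval, arbitrary
      end
    end
  end

  def stop
    Process.kill("HUP", @pid)
    Process.wait(@pid)
  end
end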
Alternatively, with a thread instead of a separate process:
class Poller
  def start
    @thread = Thread.new {
      # your polling code here
    }
  end

  def stop
    @thread.kill # Thread instances have no #stop; kill terminates the thread
  end
end
Have you checked out the Daemons gem?
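For a rough idea of what that could look like, a sketch based on the daemons gem's run_proc interface (the process name, interval, and loop body are placeholders):

require 'daemons'

# run_proc daemonizes the block and gives you start/stop control
# via the script's command-line arguments (e.g. `ruby poller_ctl.rb start`).
Daemons.run_proc('poller') do
  loop do
    # hit the HTTP endpoint and update the database here
    sleep 5
  end
end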
