In Ruby, why wrap "yield" in a call you are making anyway?

I am new to Ruby. I am confused by something I am reading here:
http://alma-connect.github.io/techblog/2014/03/rails-pub-sub.html
They offer this code:
# app/pub_sub/publisher.rb
module Publisher
  extend self

  # delegate to ActiveSupport::Notifications.instrument
  def broadcast_event(event_name, payload = {})
    if block_given?
      ActiveSupport::Notifications.instrument(event_name, payload) do
        yield
      end
    else
      ActiveSupport::Notifications.instrument(event_name, payload)
    end
  end
end
What is the difference between doing this:
ActiveSupport::Notifications.instrument(event_name, payload) do
  yield
end
versus doing this:
ActiveSupport::Notifications.instrument(event_name, payload)
yield
If this were another language, I might assume that we first call the method instrument(), and then we call yield so as to call the block. But that is not what they wrote. They show yield being nested inside of ActiveSupport::Notifications.instrument().
Should I assume that ActiveSupport::Notifications.instrument() is returning some kind of iterable that we will iterate over? Are we calling yield once for every item returned from ActiveSupport::Notifications.instrument()?

While blocks are frequently used for iteration, they have many other uses. One is to ensure proper resource cleanup, for example:
ActiveRecord::Base.with_connection do
  ...
end
This checks out a database connection for the thread, yields to the block, and then checks the connection back in.
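The same shape shows up anywhere a method needs to guarantee setup and teardown around user code. Here is a minimal sketch of the pattern (not the actual Rails implementation; acquire_resource and release_resource are hypothetical helpers):
def with_resource(name)
  resource = acquire_resource(name)        # hypothetical helper: set up
  yield resource                           # run the caller's block
ensure
  release_resource(resource) if resource   # teardown always runs, even if the block raises
end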
In the specific case of the instrument method you found, what it does is add to the event data it is about to broadcast information about the time its block took to execute. The actual implementation is more complicated, but in broad terms it's not so different from:
event = Event.new(event_name, payload)
event.start = Time.now
yield
event.end = Time.now
event
The use of yield allows it to wrap the execution of your code with some timing code. In your second example no block is passed to instrument, which detects this and records it as an event with no duration.

The broadcast_event method has been designed to accept an optional block (which allows you to pass a code block to the method).
ActiveSupport::Notifications.instrument also takes an optional block.
Your first example simply takes the block passed in to broadcast_event and forwards it along to ActiveSupport::Notifications.instrument. If there's no block, you can't yield anything, hence the different calls.
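Another way to see that the wrapper is just forwarding the block: the same method can be written with an explicit block parameter instead of block_given?/yield. This is only a close equivalent, not the code from the article (note that instrument yields the payload to a forwarded block, which the article's version discards):
module Publisher
  extend self

  def broadcast_event(event_name, payload = {}, &block)
    # passing &block when block is nil behaves as if no block was given
    ActiveSupport::Notifications.instrument(event_name, payload, &block)
  end
end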

Related

Can I send a signal to a method I call indirectly without changing the methods in between?

I have a view which aggregates the result of running a complex computation several times with different inputs. This computation relies on a couple of base methods which are sometimes, but not always, very expensive (on a cache miss). The following is a working example of roughly this situation:
def view
  10.times.map do
    complicated_process_with_many_methods
  end
end

def complicated_process_with_many_methods
  [sometimes_expensive_method, sometimes_expensive_method]
end

def sometimes_expensive_method
  if rand < 0.5
    "expensive result"
  else
    "inexpensive result"
  end
end
puts view
However, sometimes the user just wants to see whatever data we can fetch quickly, and doesn't want to wait a long time for the data that would have to be computed. That is, we should render as much of view as we can without depending on an expensive invocation of sometimes_expensive_method. sometimes_expensive_method can easily determine if this will be an expensive invocation or not, but the information required to do so should not be available to view or to complicated_process_with_many_methods.
I would like to be able to call complicated_process_with_many_methods in such a way that, if ever sometimes_expensive_method turns out to be expensive, we give up and use a default value for that invocation of complicated_process_with_many_methods. However, if nothing turns out to be expensive, then the real result of complicated_process_with_many_methods should be used.
I could add a flag which causes sometimes_expensive_method to behave differently when the result would be expensive:
def sometimes_expensive_method(throw_on_miss)
  if rand < 0.5
    if throw_on_miss
      raise ExpensiveResultException
    else
      "expensive result"
    end
  else
    "inexpensive result"
  end
end
then catch that exception in view and use a default value instead. However, this would require passing that flag through all of the different methods involved in complicated_process_with_many_methods, and I feel like those methods should not have to know about this behavior. I could instead have sometimes_expensive_method call a continuation when the result would be expensive, but that continuation would also have to be passed through all the calls. I don't think a global would help, because different requests may invoke the different behaviors at the same time.
What I'm looking for is some way for sometimes_expensive_method to change its behavior (perhaps by throwing an exception) in response to some sort of signal that is defined in view, without modifying the body of complicated_process_with_many_methods to pass the signal along. Does such a signal exist in ruby?
Try Timeout from Ruby's standard library.
I don't know if it will release the external resources (files or db connections), so use it at your own risk.
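A minimal sketch of how that could look for the example in the question, assuming a fixed time budget per invocation and a placeholder default value (both are made up for illustration):
require 'timeout'

def view
  10.times.map do
    begin
      # give each invocation a short budget; a slow (cache-miss) run is abandoned
      Timeout.timeout(0.5) { complicated_process_with_many_methods }
    rescue Timeout::Error
      ["default result", "default result"]   # fall back for that invocation only
    end
  end
end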

Ruby - how to intercept a block and modify it before eval-ing or yield-ing it?

I have been thinking about blocks in Ruby.
Please consider this code:
div {
  h2 'Hello world!'
  drag
}
This calls the method div(), and passes a block to it.
With yield I can evaluate the block.
h2() is also a method, and so is drag().
Now the thing is: h2() is defined in a module, which is included. drag(), on the other hand, resides on an object and also needs some additional information. I can provide this at run-time, but not at call-time. In other words, I need to be able to "intercept" drag(), change it, and then call that method on another object.
Is there a way to evaluate yield() line by line, or some other way? I don't have to call yield yet; it would also be possible to get this code as a string, modify drag(), and then eval() it (although this sounds ugly, I just need to have this available anyway, no matter how).
If I'm understanding you correctly, it seems that you're looking for the .tap method. Tap allows you to access intermediate results within a method chain. Of course, this would require you to restructure how this is set up.
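For reference, a tiny illustration of Object#tap: it yields the receiver to the block and then returns the receiver, so you can peek at an intermediate value inside a chain.
[1, 2, 3].tap { |a| puts "intermediate: #{a.inspect}" }.map { |n| n * 2 }
# prints: intermediate: [1, 2, 3]
# => [2, 4, 6]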
You can kind of do this with instance_eval and a proxy object.
The general idea would be something like this:
class DSLProxyObject
  def initialize(proxied_object)
    @target = proxied_object
  end

  def drag
    # Do some stuff
    @target.drag
  end

  def method_missing(method, *args, &block)
    @target.send(method, *args, &block)
  end
end

DSLProxyObject.new(target_object).instance_eval(&block)
You could implement each of your DSL's methods, perform whatever modifications you need to when a method is called, and then call what you need to on the underlying object to make the DSL resolve.
It's difficult to answer your question completely without a less general example, but the general idea is that you would create an object context that has the information you need and which wraps the underlying DSL, then evaluate the DSL block in that context, which would let you intercept and modify individual calls on a per-usage basis.
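As a rough usage sketch (current_drag_target and the wiring of div are assumptions, not part of the answer above), the DSL entry point could capture its block and evaluate it against such a proxy:
def div(&block)
  # evaluate the user's block in the proxy's context, so drag is intercepted
  # while other calls fall through to the target via method_missing
  DSLProxyObject.new(current_drag_target).instance_eval(&block)
end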

Fiber within EM::Connection (em-synchrony)

Can anybody explain to me why the Redis (redis-rb) synchrony driver works directly inside an EM.synchrony block but doesn't within EM::Connection?
Consider the following example:
EM.synchrony do
  redis = Redis.new(:path => "/usr/local/var/redis.sock")
  id = redis.incr "local:id_counter"
  puts id

  EM.start_server('0.0.0.0', 9999) do |c|
    def c.receive_data(data)
      redis = Redis.new(:path => "/usr/local/var/redis.sock")
      puts redis.incr "local:id_counter"
    end
  end
end
I'm getting
can't yield from root fiber (FiberError)
when using it within receive_data. From reading the source code for both EventMachine and em-synchrony, I can't figure out what the difference is.
Thanks!
PS: The obvious workaround is to wrap the redis code within EventMachine::Synchrony.next_tick as hinted at in issue #59, but given the EM.synchrony block I would expect the call to already be wrapped within a Fiber...
PPS: the same applies when using EM::Synchrony::Iterator.
You're doing something rather tricky here. You're providing a block to start_server, which effectively creates an "anonymous" connection class and executes your block within the post_init method of that class. Then within that class you're defining an instance method.
The thing to keep in mind is: when the reactor executes a callback, or a method like receive_data, that happens on the main thread (and within the root fiber), which is why you're seeing this exception. To work around this, you need to wrap each callback so it executes within a Fiber (for example, see the Synchrony.add_timer and add_periodic_timer methods).
To address your actual exception: wrap the execution of receive_data within a Fiber. The outer EM.synchrony {} won't do anything for callbacks which are scheduled later by the reactor.
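A minimal sketch of that fix, applied to the example from the question (a fresh Fiber is created and resumed per call; this is an illustration, not a drop-in patch):
def c.receive_data(data)
  Fiber.new do
    # the synchrony-aware Redis calls now run inside a non-root fiber
    redis = Redis.new(:path => "/usr/local/var/redis.sock")
    puts redis.incr "local:id_counter"
  end.resume
end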

May a Recursive Function Release its own Mutex?

I have some code, a class method on a Ruby class, FootballSeries.find(123), which performs an API call… owing to concerns about thread safety, only one thread may enter this method at a time. Due to some recent changes to the API, I also support the following: FootballSeries.find('Premiership'). The second variety (see implementation below) simply makes an interim call to see whether an ID can be found, then calls itself recursively using that ID.
class FootballSeries
  @find_mutex = Mutex.new
  class << self
    def find(series_name_or_id)
      @find_mutex.synchronize do
        if series_name_or_id.is_a?(String)
          if doc = search_xml_document(series_name_or_id)
            if doc.xpath('//SeriesName').try(:first).try(:content) == series_name_or_id
              @find_mutex.unlock
              series = find(doc.xpath('//seriesid').first.content.to_i)
              @find_mutex.lock
              return series
            end
          end
        elsif series_name_or_id.is_a?(Integer)
          if doc = xml_document(series_name_or_id)
            Series.new(doc)
          end
        end
      end
    end
  end
end
Without lines 9 and 11 (the unlock and re-lock calls), there's a "recursive mutex lock: deadlock" error, which makes enough sense. Therefore my question is: may I release and re-lock the mutex? (I re-lock so that when synchronize exits, I won't get an error for unlocking a mutex that I don't own… but I haven't tested whether this is required.)
Is this a sane implementation, or would I be better served having find() call two individual methods, each protected with their own mutex (for example, find_by_id and find_by_name)?
What I have now works (or at least appears to work).
Finally, bonus points: how would I test such a method for safety?
This doesn't look good to me, as @find_mutex.unlock will allow other callers to enter at the same time. Also, I don't think recursion is the usual way to do this kind of method dispatch: you actually have two methods stuffed into one. I would certainly separate them, and if you want to be able to call one method with different argument types, just check the argument's type and invoke one or the other. If you don't need to expose find_by_id and find_by_name, you can make them private and put the mutex.synchronize only in find.
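A rough sketch of that refactoring, reusing the helpers from the question (search_xml_document, xml_document, Series); this is just one way to lay it out:
class FootballSeries
  @find_mutex = Mutex.new

  class << self
    def find(series_name_or_id)
      @find_mutex.synchronize do
        if series_name_or_id.is_a?(String)
          find_by_name(series_name_or_id)
        else
          find_by_id(series_name_or_id)
        end
      end
    end

    private

    # resolves a name to an id, then calls find_by_id directly,
    # so the mutex is never re-entered
    def find_by_name(name)
      doc = search_xml_document(name)
      return unless doc && doc.xpath('//SeriesName').try(:first).try(:content) == name
      find_by_id(doc.xpath('//seriesid').first.content.to_i)
    end

    def find_by_id(id)
      doc = xml_document(id)
      Series.new(doc) if doc
    end
  end
end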

Make dynamic method calls using NoMethodError handler instead of method_missing

I'm trying to make an API for dynamically reloading processes; right now I'm at the point where I want to provide, in all contexts, a method called reload!. However, I'm implementing this method on an object that has some state (so it can't live on Kernel).
Suppose we have something like
WorkerForker.run_in_worker do
  # some code over here...
  reload! if some_condition
end
Inside the run_in_worker method there is code like the following:
begin
  worker = Worker.new(pid, stream)
  block.call
rescue NoMethodError => e
  if (e.message =~ /reload!/)
    puts "reload! was called"
    worker.reload!
  else
    raise e
  end
end
So I'm doing it this way because I want to make the reload! method available in any nested context, and I don't want to mess up the block I'm receiving with an instance_eval on the worker instance.
So my question is: are there any complications with this approach? I don't know if anybody has done this already (I haven't read that much code yet); has it been done? Is there a better way to achieve what this code is trying to do?
Assuming I understand you now, how about this:
my_object = Blah.new

Object.send(:define_method, :reload!) {
  my_object.reload!
  ...
}
Using this method, every object that invokes reload! is modifying the same shared state, since my_object is captured by the block passed to define_method.
What's wrong with doing this?
def run_in_worker(&block)
  ...
  worker = Worker.new(pid, stream)
  block.call(worker)
end
WorkerForker.run_in_worker do |worker|
  worker.reload! if some_condition
end
It sounds like you just want every method to know about an object without the method or the method's owner having been told about it. The way to accomplish this is a global variable. It's not generally considered a good idea (it leads to concurrency and ownership issues, and makes unit testing harder), but if that's what you want, there it is.
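For illustration only, a minimal sketch of the global-variable approach this answer describes (the ensure block and the nil guard are additions, and none of this addresses the concurrency caveats above):
$current_worker = nil

# top-level method definitions become private methods on Object,
# so reload! is callable from any nested context
def reload!
  $current_worker.reload! if $current_worker
end

module WorkerForker
  def self.run_in_worker(&block)
    $current_worker = Worker.new(pid, stream)   # pid and stream as in the question's snippet
    block.call
  ensure
    $current_worker = nil
  end
end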
