May a Recursive Function Release its own Mutex? - ruby

I have some code - a class method on a Ruby class, FootballSeries.find(123) - which performs an API call. Owing to concerns about thread safety, only one thread may enter this method at a time. Due to some recent changes on the API, I now also support the following: FootballSeries.find('Premiership'). The second variety (see implementation below) simply makes an interim call to see if an ID can be found, then recursively calls itself using that ID.
class FootballSeries
  @find_mutex = Mutex.new

  class << self
    def find(series_name_or_id)
      @find_mutex.synchronize do
        if series_name_or_id.is_a?(String)
          if doc = search_xml_document(series_name_or_id)
            if doc.xpath('//SeriesName').try(:first).try(:content) == series_name_or_id
              @find_mutex.unlock
              series = find(doc.xpath('//seriesid').first.content.to_i)
              @find_mutex.lock
              return series
            end
          end
        elsif series_name_or_id.is_a?(Integer)
          if doc = xml_document(series_name_or_id)
            Series.new(doc)
          end
        end
      end
    end
  end
end
Without the @find_mutex.unlock and @find_mutex.lock lines, the recursive call raises a "deadlock; recursive locking" ThreadError (which makes enough sense). My question is therefore: may I release and re-lock the mutex like this? (I re-lock so that when synchronize exits, I don't get an error for unlocking a mutex I don't own - though I haven't tested whether this is actually required.)
Is this a sane implementation, or would I be better served having find() call two individual methods, each protected with their own mutex (for example, find_by_id and find_by_name)?
What I have now works (or at least appears to work).
Finally, bonus points: how would I test such a method for safety?

This doesn't look good to me, as @find_mutex.unlock will allow other threads to enter the synchronized section at the same time. Also, I don't think recursion is the usual way to handle this kind of method dispatch - you really have two methods stuffed into one. I would certainly separate the two, and if you want to be able to call one method with different argument types, just check the argument's type and invoke one or the other. If you don't need to expose find_by_id and find_by_name, you can make them private and put mutex.synchronize only in find.
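A minimal sketch of that split, reusing the question's search_xml_document, xml_document and Series (all assumed to exist elsewhere), might look like this:
class FootballSeries
  @find_mutex = Mutex.new

  class << self
    def find(series_name_or_id)
      @find_mutex.synchronize do
        if series_name_or_id.is_a?(String)
          find_by_name(series_name_or_id)
        else
          find_by_id(series_name_or_id)
        end
      end
    end

    private

    # Neither helper touches the mutex, so find can call them while holding the lock.
    def find_by_name(name)
      doc = search_xml_document(name)
      return unless doc && doc.xpath('//SeriesName').try(:first).try(:content) == name
      find_by_id(doc.xpath('//seriesid').first.content.to_i)
    end

    def find_by_id(id)
      doc = xml_document(id)
      Series.new(doc) if doc
    end
  end
end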

Related

Better method for dynamically calling methods from input?

Is it better to use case/when statements or the send method when dynamically calling methods based on user input? "Better" based primarily on good coding practices.
input = gets.chomp
case input
when "foo"
  method1
when "bar"
  method2
end
versus
input = gets.chomp #Where hopefully the input would be 'foo' or 'bar'
send(input)
Your wording makes the question incredibly hard to read.
If I understood you correctly, you want to call methods based on user input. One alternative is to check every possible value and call the corresponding method; the other is to use send directly.
First of all, notice that in your first example, you were calling method1 when the user entered foo. If you used send(input) you would have called foo instead. So they are not exactly the same.
You can achieve the same behavior by putting the input->method mapping in a hash like so:
dispatch = {foo: :method1, bar: :method2}
input = gets.chomp.to_sym
send(dispatch[input])
Another thing to note is that send in the original situation would call any method passed to it. You can instead whitelist the possible methods with the hash above and check whether such a key exists:
send(dispatch[input]) if dispatch.key? input
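Putting the pieces together (the bodies of method1 and method2 are made up here just so the sketch runs):
def method1
  puts "you chose foo"
end

def method2
  puts "you chose bar"
end

dispatch = {foo: :method1, bar: :method2}

input = gets.chomp.to_sym
if dispatch.key?(input)
  send(dispatch[input])
else
  puts "Unknown command: #{input}"
end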
Now to the question of when to use one or the other:
If you have 2, 3, 5 or so possibilities, prefer explicitly listing them. It will be faster, easier to read, easier to do static code analysis and so on.
If you have hundreds or thousands of different methods, prefer send - there the benefits of staying DRY outweigh the costs.
If the list of allowed methods is generated dynamically, you don't have a choice - use send. Examples:
You want to call methods to a given object and that object is different each time
You want to allow different methods depending on the user's permissions
You want to implement a REPL or some other awesome tool that has extremely dynamic needs
In general, don't use metaprogramming unless there is a significant gain or you have no other choice.
Unless you want your users to be able to call any method in the method lookup chain - including private methods, which send can invoke - it probably makes sense to lock things down and only allow a specific set of methods.
If you don't specify a receiver to send to (as in your code above), Ruby will look at self for a method by that name and then follow the normal method lookup; in other words, self is the first link in the method lookup chain. If you do specify a receiver - for example an object you create just for this purpose - another option is to guard the call with methods like respond_to? (or try).
input = gets.chomp
if respond_to?(input)
  send(input)
else
  puts "No such thing!"
end

Can I send a signal to a method I call indirectly without changing the methods in between?

I have a view which aggregates the result of running a complex computation several times with different inputs. This computation relies on a couple of base methods which are sometimes, but not always, very expensive (on a cache miss). The following is a working example of roughly this situation:
def view
  10.times.map do
    complicated_process_with_many_methods
  end
end

def complicated_process_with_many_methods
  [sometimes_expensive_method, sometimes_expensive_method]
end

def sometimes_expensive_method
  if rand < 0.5
    "expensive result"
  else
    "inexpensive result"
  end
end

puts view
However, sometimes the user just wants to see whatever data we can fetch quickly, and doesn't want to wait a long time for the data that would have to be computed. That is, we should render as much of view as we can without depending on an expensive invocation of sometimes_expensive_method. sometimes_expensive_method can easily determine if this will be an expensive invocation or not, but the information required to do so should not be available to view or to complicated_process_with_many_methods.
I would like to be able to call complicated_process_with_many_methods in such a way that, if ever sometimes_expensive_method turns out to be expensive, we give up and use a default value for that invocation of complicated_process_with_many_methods. However, if nothing turns out to be expensive, then the real result of complicated_process_with_many_methods should be used.
I could add a flag which causes sometimes_expensive_method to behave differently when the result would be expensive:
def sometimes_expensive_method(throw_on_miss)
  if rand < 0.5
    if throw_on_miss
      raise ExpensiveResultException
    else
      "expensive result"
    end
  else
    "inexpensive result"
  end
end
then catch that exception in view and use a default value instead. However, this would require passing that flag through all of the different methods involved in complicated_process_with_many_methods, and I feel like those methods should not have to know about this behavior. I could instead have sometimes_expensive_method call a continuation when the result would be expensive, but that continuation would also have to be passed through all the calls. I don't think a global would help, because different requests may invoke the different behaviors at the same time.
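For concreteness, a rough sketch of that flag-and-exception approach (ExpensiveResultException and the default value are invented for the example, and the flag still has to be threaded through every intermediate method, which is exactly the objection):
class ExpensiveResultException < StandardError; end

def view
  10.times.map do
    begin
      complicated_process_with_many_methods(true)   # true = give up on expensive results
    rescue ExpensiveResultException
      "default value"                               # fall back for this invocation only
    end
  end
end

def complicated_process_with_many_methods(throw_on_miss)
  [sometimes_expensive_method(throw_on_miss), sometimes_expensive_method(throw_on_miss)]
end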
What I'm looking for is some way for sometimes_expensive_method to change its behavior (perhaps by throwing an exception) in response to some sort of signal that is defined in view, without modifying the body of complicated_process_with_many_methods to pass the signal along. Does such a signal exist in ruby?
Try the Timeout module from Ruby's standard library.
I don't know if it will release the external resources (files or db connections), so use it at your own risk.
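A minimal sketch of how that might look here, assuming a made-up two-second budget per invocation (Timeout.timeout raises Timeout::Error when the block takes too long):
require 'timeout'

def view
  10.times.map do
    begin
      Timeout.timeout(2) { complicated_process_with_many_methods }
    rescue Timeout::Error
      "default value"   # give up on this invocation and move on
    end
  end
end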

In Ruby, why wrap "yield" in a call you are making anyway?

I am new to Ruby. I am confused by something I am reading here:
http://alma-connect.github.io/techblog/2014/03/rails-pub-sub.html
They offer this code:
# app/pub_sub/publisher.rb
module Publisher
  extend self

  # delegate to ActiveSupport::Notifications.instrument
  def broadcast_event(event_name, payload = {})
    if block_given?
      ActiveSupport::Notifications.instrument(event_name, payload) do
        yield
      end
    else
      ActiveSupport::Notifications.instrument(event_name, payload)
    end
  end
end
What is the difference between doing this:
ActiveSupport::Notifications.instrument(event_name, payload) do
  yield
end
versus doing this:
ActiveSupport::Notifications.instrument(event_name, payload)
yield
If this were another language, I might assume that we first call the method instrument(), and then we call yield so as to call the block. But that is not what they wrote. They show yield being nested inside of ActiveSupport::Notifications.instrument().
Should I assume that ActiveSupport::Notifications.instrument() is returning some kind of iterable, that we will iterate over? Are we calling yield once for every item returned from ActiveSupport::Notifications.instrument()?
While blocks are frequently used for iteration, they have many other uses. One is to ensure proper resource cleanup, for example:
ActiveRecord::Base.with_connection do
  ...
end
This checks out a database connection for the thread, yields to the block, and then checks the connection back in.
In the specific case of the instrument method you found, what it does is add, to the event data it is about to broadcast, information about how long its block took to execute. The actual implementation is more complicated, but in broad terms it's not so different from:
event = Event.new(event_name, payload)
event.start = Time.now
yield
event.end = Time.now
event
The use of yield allows it to wrap the execution of your code with some timing code. In your second example no block is passed to instrument; it detects this and records the event as having no duration.
The broadcast_event method has been designed to accept an optional block (which allows you to pass a code block to the method).
ActiveSupport::Notifications.instrument also takes an optional block.
Your first example simply takes the block passed in to broadcast_event and forwards it along to ActiveSupport::Notifications.instrument. If there's no block, you can't yield anything, hence the different calls.
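For illustration, callers would use it like this (the event name and payload are made up):
# With a block: instrument times the block and broadcasts the duration with the event.
Publisher.broadcast_event('order.placed', order_id: 42) do
  place_order   # hypothetical work being measured
end

# Without a block: an event with no duration is still broadcast.
Publisher.broadcast_event('order.placed', order_id: 42)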

Calling original new() method from within overridden new()?

I have a class whose initialize method gets data from a remote source, which it then uses to set its object attributes.
I'm expecting this class to be heavily used, possibly hundreds of times within a program. I'd like to reduce the overhead network calls by caching objects that have already been instantiated, which I'd then return to the user when they ask to instantiate that object again.
For that, I've been considering overriding the new method for this class. It would check to see if the cached object is available, and if so, return that reference. Otherwise, it would call the regular new method for the object, which would allocate memory and call initialize like usual.
If I override the new() method in a class, is it possible to call the original new method via something like super()?
Yes, super without parameters will call the parent method with the same parameters passed to the new method.
Or, you can cherry-pick parameters by adding them to super(p1, p2, ...).
Regarding what you want to do by remembering previous invocations, that's called "memoizing" and there is at least one memoize gem for it, or, you can write your own, depending on your needs.
It's pretty easy to do using a hash. Use the parameters used when invoking the new method as the key, and the value is the instance you want to return. Without examples of your code it's hard to come up with an example that's custom-fit, but this is a simple, untested, version:
def self.new(*args)
  @memoizer ||= {}
  return @memoizer[args] if @memoizer[args]

  # Do what you will with the args here, then create a new
  # instance via the original new and remember it for next time.
  @memoizer[args] = super(*args)
end
The idea is that @memoizer remembers the "arity" of the call and automatically returns the result of similar calls. If that set of parameters hasn't been seen before it'll compute and create the new instance and then return it.
This breaks down when the result could change with the same set of input parameters. You don't want to memoize database calls or anything using random or a date/time value, or that returns something outside your control. Trying to use it in those cases will return stale or wrong values unless you design in a method to sweep through and revalidate the @memoizer values periodically.
Also, there is no flushing mechanism, so @memoizer will only grow in size, and could possibly consume all available space given enough different input values. To deal with that you could also have a timestamp for when the value was added to @memoizer, and periodically purge entries that exceed a given lifetime. Only "live" values would remain in the hash then.
Useful information is at: "Memoize Techniques in Ruby and Rails".
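As a usage illustration of the memoized new (the RemoteRecord class and its argument are made up), the second call with the same arguments returns the cached instance instead of doing the expensive work again:
class RemoteRecord
  def self.new(*args)
    @memoizer ||= {}
    @memoizer[args] ||= super(*args)
  end

  def initialize(id)
    @id = id   # imagine the expensive remote fetch happening here
  end
end

a = RemoteRecord.new(123)
b = RemoteRecord.new(123)
a.equal?(b)   # => true - the second call came from the cache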
Either via super or via an alias_method chain. super is better if you subclass.
Check this: When monkey patching a method, can you call the overridden method from the new implementation?

(J)Ruby - safe way to redefine core classes?

What is a safe way to redefine methods in core classes like File, String, etc.? I'm looking to implement something similar to the Java Security Manager in (J)Ruby.
I'm looking for a way to redefine a method so that it first checks which class/script called it: if that class/script belongs to a list of blacklisted classes (that I keep track of), it raises an exception; if the calling class is not blacklisted, the operation is allowed. Something like:
class String
  alias_method :old_length, :length

  def length
    if calling_class_blacklisted?   # placeholder - now how do I get the calling class?
      raise "bad boy"
    else
      old_length
    end
  end
end
I tried this in JRuby, but it only works every other time: one call invokes the new length method, and the next invokes the old one. I guess the alias doesn't work properly in JRuby! >.<
If something works sometimes, but not at other times, it's more likely to be your code that's the problem, not JRuby. Select isn't broken.
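As for the question's placeholder (finding out who called the method), one possible sketch uses Kernel#caller_locations to inspect the call site; the blacklisted path below is made up, and this checks source file paths rather than classes:
class String
  BLACKLISTED_PATHS = ['/untrusted/'].freeze   # hypothetical blacklist

  alias_method :old_length, :length

  def length
    # caller_locations returns the call stack; the first entry is the immediate caller.
    call_site = caller_locations(1, 1).first
    if call_site && BLACKLISTED_PATHS.any? { |path| call_site.path.include?(path) }
      raise "bad boy"
    else
      old_length
    end
  end
end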

Resources