Test cached (multiprocessed) and memoized method - ruby

Short question, I'd like to test this method foo:
class MyClass
  def foo
    # The `@cache_key` is defined by instance but can be the same for
    # different instances (obviously I guess)
    @foo ||= cache.fetch(@cache_key) do
      # call a private method, tested aside
      generate_foo_value || return
      # The `|| return` trick is one I use to skip the yield result
      # in case of falsey value. May not be related to the question,
      # but I left it there because I'm not so sure.
    end
  end
end
The issue here is that I have two kinds of caching.
On one hand, I have @foo ||=, which ensures memoization and which I can test pretty easily, for instance by calling my method twice and checking whether the result differs.
On the other hand, I have the cache.fetch method, whose cache may be shared between different instances. This is where my trouble lies: I don't know the best way to unit test it. Should I mock my cache? Should I inspect the cached result? Or should I run multiple instances with the same cache key?
And for this last option, I don't know how to run multiple instances easily using RSpec.

I don't think you need multiple instances in this case. It looks like you could just set the cache value in the specs and test that the MyClass instance returns it.
before { cache.set(:cache_key, 'this is from cache') }
# You didn't provide the initializer, so I'll just assume that you can
# set the cache key and inject the cache object.
subject { MyClass.new(:cache_key, cache) }
specify do
  expect(subject.foo).to eq 'this is from cache'
end
You can also set the expectation that generate_foo_value is not run at all in this case.
My reasoning is this: you don't need to fully emulate a separate process setting the cache. You only test that, if the cache already holds a value for the key, your method returns it instead of doing the expensive computation.
The fact that this cache is shared between processes (like it sits in a PostgreSQL DB, or Redis or whatever) is irrelevant for the test.
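A minimal, self-contained sketch of that idea in plain Ruby rather than RSpec. FakeCache, its set method, the MyClass initializer and the generate_calls counter are all assumptions made for illustration, since the question shows neither the initializer nor the real cache API; the only cache behaviour relied on is fetch(key) { ... }.

```ruby
# Hypothetical in-memory stand-in for the real cache; only `fetch` and a
# `set` helper are modelled.
class FakeCache
  def initialize
    @store = {}
  end

  def set(key, value)
    @store[key] = value
  end

  # Returns the stored value, or runs the block and stores its result.
  def fetch(key)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end
end

class MyClass
  attr_reader :generate_calls # counter added only so the sketch can assert on it

  # Assumed initializer: the cache key and the cache object are injected.
  def initialize(cache_key, cache)
    @cache_key = cache_key
    @cache = cache
    @generate_calls = 0
  end

  def foo
    @foo ||= @cache.fetch(@cache_key) { generate_foo_value || return }
  end

  private

  def generate_foo_value
    @generate_calls += 1
    "computed"
  end
end

cache = FakeCache.new
cache.set(:cache_key, "this is from cache")

first = MyClass.new(:cache_key, cache)
second = MyClass.new(:cache_key, cache)

first.foo  # => "this is from cache"
second.foo # => "this is from cache"
```

Two instances sharing one key both read the pre-set value, and generate_foo_value never runs for either of them, which is the whole contract that matters to the unit test.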

Ruby: understanding data structure

Most FactoryBot factories look like this:
FactoryBot.define do
  factory :product do
    association :shop
    title { 'Green t-shirt' }
    price { 10.10 }
  end
end
It seems that inside the ":product" block we are building a data structure, but it's not a typical hash: the "keys" are not declared as symbols and no commas are used.
So my question is: what kind of data structure is this, and how does it work?
Why doesn't calling "association" inside the block trigger a:
NameError: undefined local variable or method `association'
when it would in many other situations? Is there a computer science topic related to this?
The block is not a data structure, it's code. association and friends are all method calls, probably being intercepted by method_missing. Here's an example using that same technique to build a regular hash:
class BlockHash < Hash
  def method_missing(key, value=nil)
    if value.nil?
      return self[key]
    else
      self[key] = value
    end
  end
  def initialize(&block)
    self.instance_eval(&block)
  end
end
With which you can do this:
h = BlockHash.new do
  foo 'bar'
  baz :zoo
end
h
#=> {:foo=>"bar", :baz=>:zoo}
h.foo
#=> "bar"
h.baz
#=> :zoo
I have not worked with FactoryBot, so I'm going to make some assumptions based on other libraries I've worked with. Mileage may vary.
The basics:
FactoryBot is a module (a namespace you can call methods on directly).
define is a module-level method on FactoryBot (I'm going to assume I still haven't lost you ;) ).
define takes a block, which is pretty standard stuff in Ruby.
But here's where things get interesting.
Typically, when a block is executed, it runs in the context where it was declared: self and the methods it can see are those of the surrounding code. This can be changed in most languages, but Ruby makes it super easy: instance_eval(&block) will do the trick. That means the block can call methods that weren't available outside it.
factory on line 2 is just such a method. You didn't declare it, but the block it's running in isn't being executed with the standard scope. Instead, your block is immediately passed to FactoryBot, which passes it to an inner class named DSL, which instance_evals the block so that its own factory method is the one that runs.
Lines 3-5 don't work that way, since you can have an arbitrary name there.
Ruby has several ways to handle missing methods, but the most straightforward is method_missing. method_missing is an overridable hook that any class can define to tell Ruby what to do when somebody calls a method that doesn't exist.
Here it's checking to see if it can parse the name as an attribute name and use the parameters or block to define an attribute or declare an association. It sounds more complicated than it is. Typically in this situation I would use define_method, define_singleton_method, instance_variable_set etc... to dynamically create and control the underlying classes.
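To make the mechanics concrete, here is a rough sketch of such a DSL. Everything in it (TinyFactory, DSL, AttributeProxy, registry) is hypothetical and only illustrates the instance_eval plus method_missing technique; it is not FactoryBot's actual implementation. One assumption of the sketch: association is defined as a regular method, while arbitrary names like title and price fall through to method_missing.

```ruby
# Hypothetical mini-DSL; the names are illustrative, not FactoryBot's API.
module TinyFactory
  def self.registry
    @registry ||= {}
  end

  # Runs the block with `self` set to a DSL object, so `factory` resolves there.
  def self.define(&block)
    DSL.new.instance_eval(&block)
  end

  class DSL
    def factory(name, &block)
      proxy = AttributeProxy.new
      proxy.instance_eval(&block) # `self` inside the factory block is the proxy
      TinyFactory.registry[name] = proxy.attributes
    end
  end

  class AttributeProxy
    attr_reader :attributes

    def initialize
      @attributes = {}
    end

    # Defined explicitly, so this call never reaches method_missing.
    def association(name)
      @attributes[name] = [:association, name]
    end

    # Any other name called with a block (and no arguments) becomes an attribute.
    def method_missing(name, *args, &block)
      return super unless block && args.empty?

      @attributes[name] = block
    end
  end
end

TinyFactory.define do
  factory :product do
    association :shop
    title { "Green t-shirt" }
    price { 10.10 }
  end
end

product = TinyFactory.registry[:product]
product[:title].call # => "Green t-shirt"
```

The attribute blocks are stored rather than run, so each value is only computed when something later calls the block.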
I hope that helps. You don't need to know this to use the library; the developers made a domain-specific language so people wouldn't have to think about this stuff. But stay curious and keep growing.

Ruby hash with lazy keys

I have a collection of 'data endpoints'. Each endpoint has a name and can be available or unavailable. In Ruby I want to present the available endpoints as a Hash to make it easy to work with them. The difficulty is that getting information about the endpoints is costly and should be done lazily.
Some examples of how I want my object to behave:
endpoints = get_endpoints.call # No endpoint information is accessed yet
result = endpoints['name1'] # This should only query endpoint "name1"
is_available = endpoints.key? 'name2' # This should only query endpoint "name2"
all_available = endpoints.keys # This has to query all endpoints
The comments describe how the object internally makes requests to the 'data endpoints'.
It is straightforward to make a Hash that can do the first 2 lines. However I don't know how to support the last 2 lines. To do this I need a way to make the keys lazy, not just the values.
Thank you for taking a look!
You'd have to override the key? method, and do your own checking in there.
class LazyHash < Hash
  def key?(key)
    # Do your checking here, however that looks for your application
  end
end
In my opinion, you're asking for trouble though. One of the most powerful virtues in computer science is predictability. If you're changing the behavior of something, modifying it far beyond its intent, it doesn't serve you to keep calling it by the original name. You don't need to shoe-horn your solution into existing classes/interfaces.
Programming offers you plenty of flexibility, so you can do stuff like this (depending on the language, of course), but by the same token, you have no reason not to simply build a new object/service with its own API.
I recommend starting fresh with a new class and building out your desired interface and functionality.
class LazyEndpoints
  def on?(name)
  end
  def set(name, value)
  end
end
(Or something like that, the world is yours for the taking!)
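For what it's worth, here is one way such a class could be fleshed out to match the four lines in the question. It is only a sketch under assumptions: the list of candidate names is known up front, and the costly request is passed in as a block (called probe here) that returns the endpoint's data or nil when the endpoint is unavailable. Neither of those comes from the question.

```ruby
# Hypothetical sketch: `names` and the probe block are assumptions.
class LazyEndpoints
  # names: every endpoint name that might exist
  # probe: block returning the endpoint's data, or nil when it is unavailable
  def initialize(names, &probe)
    @names = names
    @probe = probe
    @results = {}
  end

  # Queries only the requested endpoint, and at most once.
  def [](name)
    return nil unless @names.include?(name)

    @results.fetch(name) { @results[name] = @probe.call(name) }
  end

  def key?(name)
    !self[name].nil?
  end

  # Has to query every endpoint that has not been queried yet.
  def keys
    @names.select { |name| key?(name) }
  end
end

queried = []
endpoints = LazyEndpoints.new(%w[name1 name2 name3]) do |name|
  queried << name # stands in for the costly request
  name == "name2" ? nil : "data for #{name}"
end

endpoints["name1"]      # queries only "name1"
endpoints.key?("name2") # queries only "name2"
endpoints.keys          # queries the remaining "name3"
```

Because results (including nil for unavailable endpoints) are remembered, calling keys a second time issues no further requests.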

Weird Ruby class initialization logic?

Some open source code I'm integrating in my application has some classes that include code to that effect:
class SomeClass < SomeParentClass
  def self.new(options = {})
    super().tap { |o|
      # do something with `o` according to `options`
    }
  end
  def initialize(options = {})
    # initialize some data according to `options`
  end
end
As far as I understand, both self.new and initialize do the same thing - the latter one "during construction" and the former one "after construction", and it looks to me like a horrible pattern to use - why split up the object initialization into two parts where one is obviously "The Wrong Thing(tm)"?
Ideally, I'd like to see what is inside the super().tap { |o| block, because although this looks like bad practice, just maybe there is some interaction required before or after initialize is called.
Without context, it is possible that you are just looking at something that works but is not considered good practice in Ruby.
However, maybe the approach of separate self.new and initialize methods allows the framework designer to implement a subclass-able part of the framework and still ensure that the setup the framework requires is completed, without slightly awkward documentation that demands a specific use of super(). The API would be slightly easier to document and cleaner-looking if the end user gets the functionality they expect with just the subclass declaration class MyClass < FrameworkClass, without an additional note like:
When you implement the subclass initialize, remember to put super at the start, otherwise the magic won't work
. . . personally I'd find that design questionable, but I think there would at least be a clear motivation.
There might be deeper Ruby language reasons to have code run in a custom self.new block - for instance, it may allow the constructor to switch or alter the specific object (even returning an object of a different class) before returning it. However, I have very rarely seen such things done in practice; there is nearly always some other way of achieving the goals of such code without customising new.
Examples of custom/different Class.new methods raised in the comments:
Struct.new, which can optionally take a class name and returns a dynamically created class rather than an instance of Struct.
Single-table inheritance in ActiveRecord, which allows the end user to load an object of unknown class from a table and receive an object of the right class.
The latter one could possibly be avoided with a different ORM design for inheritance (although all such schemes have pros/cons).
The first one (Structs) is core to the language, so has to work like that now (although the designers could have chosen a different method name).
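As an illustration of a new that hands back an object of a different class, here is a small sketch. Shape, Circle and Square are made-up classes, not taken from any library; the point is only that a custom self.new is free to pick the class it instantiates.

```ruby
# Hypothetical classes for illustration only.
class Shape
  # Overrides new on Shape: the first argument selects the concrete class.
  def self.new(*args)
    return super unless equal?(Shape) # subclasses are constructed normally

    kind, *rest = args
    { circle: Circle, square: Square }.fetch(kind).new(*rest)
  end
end

class Circle < Shape
  def initialize(radius)
    @radius = radius
  end
end

class Square < Shape
  def initialize(side)
    @side = side
  end
end

Shape.new(:circle, 2).class # => Circle
Square.new(3).class         # => Square
```

The same result could be had with an ordinary factory method such as a hypothetical Shape.build(kind, ...), which is the "other way of achieving the goals" mentioned above.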
It's impossible to tell why that code is there without seeing the rest of the code.
However, there is something in your question I want to address:
As far as I understand, both self.new and initialize do the same thing - the latter one "during construction" and the former one "after construction"
They do not do the same thing.
Object construction in Ruby is performed in two steps: Class#allocate allocates a new empty object from the object space and sets its internal class pointer to self. Then, you initialize the empty object with some default values. Customarily, this initialization is performed by a method called initialize, but that is just a convention; the method can be called anything you like.
There is an additional helper method called Class#new which does nothing but perform the two steps in sequence, for the programmer's convenience:
class Class
  def new(*args, &block)
    obj = allocate
    obj.send(:initialize, *args, &block)
    obj
  end
  def allocate
    obj = __MagicVM__.__allocate_an_empty_object_from_the_object_space__
    obj.__set_internal_class_pointer__(self)
    obj
  end
end
class BasicObject
  private def initialize(*) end
end
The constructor new has to be a class method because no instance exists yet at the point where it is called, so there is nothing to call an instance method on. The initialization routine initialize, on the other hand, is better defined as an instance method because it does something specific to one particular instance. Hence, Ruby is designed so that the class method new calls the instance method initialize on the new instance right after creating it.
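A short runnable sketch of the two steps and of where a custom self.new fits in. Widget and its log are invented for the demonstration; allocate, initialize and tap are the standard Ruby methods discussed above.

```ruby
class Widget
  attr_reader :log

  # Runs after initialize has finished, on the fully built object.
  def self.new(*args)
    super.tap { |widget| widget.log << :after_new }
  end

  def initialize
    @log = [:initialize]
  end
end

# A subclass can override initialize freely; the code in Widget.new still runs.
class SpecialWidget < Widget
  def initialize
    super
    @log << :subclass_initialize
  end
end

Widget.new.log        # => [:initialize, :after_new]
SpecialWidget.new.log # => [:initialize, :subclass_initialize, :after_new]

# allocate skips initialize entirely, so @log was never assigned.
bare = Widget.allocate
bare.class # => Widget
bare.log   # => nil
```

This is the ordering the question calls "during" and "after" construction: the tap block only ever sees an object whose initialize, including any subclass override, has already completed.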

How to stub method with specific parameter (and leave calls with other parameters unstubbed) in Mocha?

This question may seem like a duplicate of this one but the accepted answer does not help with my problem.
Context
Since Rails 5 no longer supports directly manipulating sessions in controller tests (which now inherit from ActionDispatch::IntegrationTest), I am going down the dark path of mocking and stubbing.
I know that this is bad practice and there are better ways to test a controller (and I do understand their move to integration tests) but I don't want to run a full integration test and call multiple actions in a single test just to set a specific session variable.
Scenario
Mocking/stubbing a session variable is actually quite easy with Mocha:
ActionDispatch::Request::Session.any_instance.stubs(:[]).with(:some_variable).returns("some value")
Problem is, Rails stores a lot of things inside the session (just do a session.inspect anywhere in one of your views) and stubbing the :[] method obviously prevents access to any of them (so session[:some_other_variable] in a test will no longer work).
The question
Is there a way to stub/mock the :[] method only when called with a specific parameter and leave all other calls unstubbed?
I would have hoped for something like
ActionDispatch::Request::Session.any_instance.stubs(:[]).with(:some_variable).returns("some value")
ActionDispatch::Request::Session.any_instance.stubs(:[]).with(anything).returns(original_value)
but I could not find a way to get it done.
As far as I can see, this feature is not available in Mocha:
https://github.com/freerange/mocha/issues/334
I know it does exist in rspec-mocks:
https://github.com/rspec/rspec-mocks/blob/97c972be57f2c060a4a7fb8a3c5700a5ede693f0/spec/rspec/mocks/stub_implementation_spec.rb#L29
One hacky way you can do it, though, is to keep a reference to the original session and stub the controller so that, whenever it asks for session, it gets a wrapper object. The wrapper either returns a mocked value or delegates the call to the original session:
class MySession
  def initialize(original)
    @original = original
  end
  def [](key)
    if key == :mocked_key
      2
    else
      @original[key]
    end
  end
end
let!(:original_session) { controller.send(:session) }
let(:my_session) { MySession.new(original_session) }
before do
  controller.stubs(:session).returns(my_session)
end
I guess Mocha may also let you do block-based stubbing, in which case you wouldn't need the class, but you would still need original_session to be evaluated before the stub is installed.
But I don't see a clean way.
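A variation on the same wrapper idea, sketched with SimpleDelegator from Ruby's standard library so that every method other than [] (for example key? or []=) still reaches the original object. PartiallyStubbedSession is a hypothetical name, and a plain Hash stands in for the real session here; whether the real ActionDispatch session behaves well behind a delegator is untested.

```ruby
require "delegate"

# Hypothetical wrapper: answers `[]` for chosen keys, delegates everything else.
class PartiallyStubbedSession < SimpleDelegator
  def initialize(original, overrides)
    super(original)
    @overrides = overrides
  end

  def [](key)
    return @overrides[key] if @overrides.key?(key)

    __getobj__[key] # fall back to the wrapped (original) object
  end
end

# A plain Hash stands in for the real session object.
original = { user_id: 42, locale: "en" }
session = PartiallyStubbedSession.new(original, { some_variable: "some value" })

session[:some_variable] # => "some value"
session[:user_id]       # => 42
session.key?(:locale)   # => true, delegated to the original
```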

How to reload a ruby class

In many of our classes we cache expensive operations for performance, e.g.
def self.foo
  @foo ||= get_foo
end
This works great in the application, however the tests (RSpec) fail because of these memoized variables. The values from the first test are being returned in the subsequent tests, when we expect fresh values.
So the question is: how do I reload the class? Or remove all the memoized variables?
Add an after (or before) block to the example group to remove the instance variable (assuming the object in question is the subject):
after do
  subject.instance_variable_set(:@foo, nil)
end
Or fix the problem. Having a memoized class instance variable is a bit of a smell since it will never change. Normal instance variables wouldn't have this issue since you'd create a new object for each test.
Build your classes and tests in such a way that the cached data remains correct or gets deleted when it is invalid. Consider adding a method to clear the cache and calling it in an RSpec before block.
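A minimal sketch of that last suggestion. Settings, reset_cache! and the source writer are hypothetical names; the writer exists only so the sketch can change what the "expensive" call returns.

```ruby
class Settings
  class << self
    # Hypothetical hook so the example can swap the expensive source.
    attr_writer :source

    def foo
      @foo ||= get_foo
    end

    # Forget the memoized value so the next call recomputes it.
    def reset_cache!
      @foo = nil
    end

    private

    def get_foo
      @source.call
    end
  end
end

Settings.source = -> { "first" }
Settings.foo # => "first"

Settings.source = -> { "second" }
Settings.foo # => "first", still memoized

Settings.reset_cache!
Settings.foo # => "second"
```

In a spec this would typically be wired up as before { Settings.reset_cache! }, so every example starts from an empty cache without reaching into the class with instance_variable_set.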
