Why is using a class variable in Ruby considered a 'code smell'? - ruby

According to Reek, creating a class variable is considered a 'code smell'. What is the explanation behind this?

As you can find in their documentation on Class Variables:
Class variables form part of the global runtime state, and as such make it easy for one part of the system to accidentally or inadvertently depend on another part of the system. So the system becomes more prone to problems where changing something over here breaks something over there. In particular, class variables can make it hard to set up tests (because the context of the test includes all global state).
Essentially, it's a manifestation of global state, which is almost universally considered evil, because it makes tests more difficult and results in a much more fragile class/program structure.
This Stack Overflow question may also be worth reading, which shows the main problem with class variables: if any class inherits from your class and modifies the class variable, the value changes for every class in the hierarchy, including the parent! This makes it easy to shoot yourself in the foot, so it may be best to avoid them unless you're very careful.
It's also worth comparing class variables with class instance variables. This question has a few good examples which illustrate the usage differences, but in essence class variables are shared, whereas class instance variables are not shared. Therefore, to avoid unwanted side effects, class instance variables are almost always what you want.

In brief, this:
class Shape
  @@sides = 0
  def self.sides
    @@sides
  end
end
class Pentagon < Shape
  @@sides = 5
end
puts Shape.sides # oops ... prints 5
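For contrast, here is a minimal sketch of the same two classes using class instance variables, as recommended above. The reader defined on the singleton class is an illustrative choice, not the only way to expose the value.

```ruby
class Shape
  # A class instance variable: it belongs to the Shape object itself.
  @sides = 0

  class << self
    # Reader defined on the singleton class; subclasses inherit the method,
    # but each class reads its own @sides.
    attr_reader :sides
  end
end

class Pentagon < Shape
  # This sets Pentagon's own @sides and leaves Shape's untouched.
  @sides = 5
end

puts Shape.sides    # prints 0
puts Pentagon.sides # prints 5
```

Note that a subclass which never assigns @sides would report nil rather than inheriting the parent's value, which is the price of not sharing.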

Related

Static global C-like variables in Ruby

Does Ruby have static global variables?
With this I mean global variables only accessible from the file where they were defined.
Short answer: No.
The long answer is more complicated.
There's only one global namespace in Ruby and any alterations to it from any code will have the effect of changing it for all code. To keep things local you need to scope them to a particular context, typically module or class. For example:
module PrivateStuff
  @private_variable = "Private (mostly)"
  def self.expose_private_variable
    @private_variable
  end
end
Note this doesn't prevent others from accessing your private variables using instance_variable_get or techniques like that.
This usually isn't a big deal since global variables are usually a sign of bad design and should be avoided unless there's no alternative, a case that's exceedingly rare.
Unlike compiled languages which enforce very strict rules when it comes to data access, Ruby leaves it up to the programmer to be disciplined and simply not do it in the first place.
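To make the "mostly" concrete, here is a small sketch showing both the intended accessor and the instance_variable_get bypass mentioned above, using the same PrivateStuff module.

```ruby
module PrivateStuff
  # Instance variable on the module object itself, not a global.
  @private_variable = "Private (mostly)"

  def self.expose_private_variable
    @private_variable
  end
end

# The intended way in:
puts PrivateStuff.expose_private_variable
# prints Private (mostly)

# Nothing stops a determined caller from reaching in directly:
puts PrivateStuff.instance_variable_get(:@private_variable)
# prints Private (mostly)
```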

Is "extend self" an anti-pattern for utility modules? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
Steve Klabnik recently said in a pull request for a utility module:
[The code] obscures the fact that these are class methods, and we
want to use them that way. Plus, I think that extend self is generally
an anti-pattern, and shouldn't really be used except in some cases. I
thought about it, and I think this is one of those cases.
When creating a utility module (e.g., for math) what's the best way to declare the methods?
And when is the extend self idiom ever appropriate?
It's not exactly helpful that he doesn't say 1) why he thinks it is an anti-pattern (other than obscuring the fact that you are defining class methods) or 2) why, having thought about it, this is one of the cases he thinks it shouldn't be used, so it's hard to specifically counter any of his arguments.
However, I'm not convinced that extend self is an anti-pattern. A utility module seems like a good example of a use case for it. I've also used it as easy storage for test fixtures.
I think it's worth examining what extend self is, what possible problems there might be with it, and what alternatives there are.
At its core, it is simply a way to avoid having to write self. before every method definition in a module that you never intend to mix into a class, and that will therefore never have an 'instance' of itself created, so can by definition only have 'class' methods (if you want to be able to call them, that is).
Does it disguise the fact that you intend the methods to be used as class methods? Well, yes, if you don't look at the top of the file to where it says extend self, that's possible. However, I would argue that if it's possible for you to make this confusion, your class is probably too complicated anyway.
It should be obvious from your class - from its name and from its contents - that it is intended as a collection of utility functions. Ideally it wouldn't be much more than a screen tall anyway, so extend self would almost never be out of sight. And as we'll see, the alternatives also suffer almost exactly the same problem.
One alternative would be to use class << self like this:
module Utility
  class << self
    def utility_function1
    end
    def utility_function2
    end
  end
end
I am not a fan of this, not least because it introduces an extra layer of indentation. It's also ugly (totally subjective, I know). It also suffers from exactly the same problem of 'obscuring' the fact that you're defining class methods.
You're also free, using this approach, to define instance methods outside the class << self block - which might lead to the temptation to do so (although I'd hope it wouldn't), so I'd argue that extend self is superior in this regard by removing the possibility of this muddying of the waters.
(The same is true, of course, of the 'long-hand' style of using def self.utility_function.)
Another approach could be to use a singleton object. I don't think this is a good idea at all, because a singleton object is an object for a reason - it's meant to hold state and do stuff, but also be the only one in existence. That simply doesn't make sense for a utility module, which should be a series of independent stateless functions. You don't want MathUtils.cos(90) to ever return a different value based on internal state of MathUtils, right? (I know that you can of course hold state in a module and do all these things, but it's more of a semantic division for me than a technical one).
It also leads to the same problem of arguably obscuring the fact that the methods are intended to be called as class methods (sort of). They are defined as instance methods, and you call them as instance methods, but by first getting the single instance of the class by calling the class method instance.
require 'singleton'
class MathSingleton
  include Singleton
  def cos(x)
  end
end
MathSingleton.instance.cos x
This would be a terrible alternative to extend self for this purpose. But look, also, the only thing indicating that these methods are to be used as methods on the singleton instance is that one line just up at the top, just like extend self.
So what other possible downsides are there? I don't know of any, but I'd be interested to hear them if anyone else does.
I would argue that extend self leads to shorter code, that leaves out the extraneous self.s and allows you to concentrate on the names and therefore meanings of its methods.
It also has the nice property that, if you are writing another class that uses lots of your utility functions, for example, you can just mix it in and they will be available without having to use the module name each time. Much like static imports work in other languages.
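A minimal sketch of both uses described in this answer: calling the methods on the module itself, and mixing the module into another class. MathUtils#double and the Report class are hypothetical names invented for illustration.

```ruby
module MathUtils
  # Makes every instance method below also callable as MathUtils.method_name.
  extend self

  def double(x)
    x * 2
  end
end

# A class that uses the utilities often can mix the module in and drop the prefix.
class Report
  include MathUtils

  def total(x)
    double(x) + 1
  end
end

puts MathUtils.double(4)  # prints 8
puts Report.new.total(4)  # prints 9
```

One consequence worth knowing: with extend self the mixed-in methods stay public, so Report.new.double(4) is also callable from outside.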
Utility modules, as opposed to mixins, are containers that wrap constants and methods with some common concern. Mixins, such as this one,
module SingingCapability
def sing; puts "I'm singing!" end
end
Human = Class.new
Fred = Human.new.tap { |o| o.extend SingingCapability }
generally pose some requirements on their includers. That is, generally only certain objects are good candidates to include or extend a given mixin. Hypothetically, it is possible that a module is at the same time a utility module, and a mixin. And if the module itself belongs among eligible candidates to be extended by it, then go ahead and extend it.
In sum, I don't think it's a very good practice, but Ruby defies me on this one, since we even have the Module#module_function method to facilitate this malpractice:
module SingingBox
def sing; "Tralala!" end
module_function :sing
end
SingingBox.sing #=> "Tralala!"
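One difference from extend self is worth a sketch: module_function copies the method onto the module and makes the instance-method version private. The Choir class below is a hypothetical includer added for illustration.

```ruby
module SingingBox
  def sing
    "Tralala!"
  end
  # Creates a public module-level copy and makes the instance method private.
  module_function :sing
end

class Choir
  include SingingBox

  def perform
    sing # allowed: private methods can be called without an explicit receiver
  end
end

puts SingingBox.sing              # prints Tralala!
puts Choir.new.perform            # prints Tralala!
puts Choir.new.respond_to?(:sing) # prints false
```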

How can I determine the size of methods and classes in Ruby?

I'm working on a code visualization tool and I'd like to be able to display the size (in lines) of each Class, Method, and Module in a project. It seems like existing parsers (such as Ripper) could make this info easy to get. Is there a preferred way to do this? Is there a method of assessing size for classes that have been re-opened in separate locations? How about for dynamically (Class.new {}, Module.new {}) defined structures?
I think what you're asking for is not possible in general without actually running the whole Ruby program the classes are part of (and then you run into the halting problem). Ruby is extremely dynamic, so lines could be added to a class' definition anywhere, at any time, without necessarily referring to the particular class by name (e.g. using class_eval on a class passed into a method as an argument). Not that the source code of a class' definition is saved anyway... I think the closest you could get to that is the source_locations of the methods of the class.
You could take the difference of the maximum and minimum line numbers of those source_locations for each file. Then you'd have to assume that the class is opened only once per file, and that the size of the last method in a file is negligible (as well as any non-method parts of the class definition that happen before the first method definition or after the last one).
If you want something more accurate maybe you could run the program, get method source_locations, and try to correlate those with a separate parse of the source file(s), looking for enclosing class blocks etc.
But anything you do will most likely involve assumptions about how classes are generally defined, and thus not always be correct.
EDIT: Just saw that you were asking about methods and modules too, not just classes, but I think similar arguments apply for those.
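The min/max idea above can be sketched roughly as follows. The helper name approximate_line_spans and the Sample class are made up for this illustration, and the result carries all the caveats already listed (one opening per file, last method's body ignored).

```ruby
class Sample
  def first
    1
  end

  def second
    2
  end
end

# For each file that defines methods of klass, return the distance between
# the first and last `def` line. This is only a lower bound on the class size.
def approximate_line_spans(klass)
  lines_by_file = Hash.new { |hash, file| hash[file] = [] }
  klass.instance_methods(false).each do |name|
    file, line = klass.instance_method(name).source_location
    next unless file # methods implemented in C have no source location
    lines_by_file[file] << line
  end
  lines_by_file.transform_values { |lines| lines.max - lines.min }
end

# Sample's two `def` lines are 4 lines apart, so the single value is 4.
p approximate_line_spans(Sample).values
```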
I've created a gem, class_source, that handles this problem in the fashion suggested by wdebaum. It certainly doesn't cover all cases but is a nice 80% solution for folks that need this type of thing. Patches welcome!

When to use RSpec let()?

I tend to use before blocks to set instance variables. I then use those variables across my examples. I recently came upon let(). According to the RSpec docs, it is used
... to define a memoized helper method. The value will be cached across multiple calls in the same example but not across examples.
How is this different from using instance variables in before blocks? And also when should you use let() vs before()?
I always prefer let to an instance variable for a couple of reasons:
Instance variables spring into existence when referenced. This means that if you fat finger the spelling of the instance variable, a new one will be created and initialized to nil, which can lead to subtle bugs and false positives. Since let creates a method, you'll get a NameError when you misspell it, which I find preferable. It makes it easier to refactor specs, too.
A before(:each) hook will run before each example, even if the example doesn't use any of the instance variables defined in the hook. This isn't usually a big deal, but if the setup of the instance variable takes a long time, then you're wasting cycles. For the method defined by let, the initialization code only runs if the example calls it.
You can refactor from a local variable in an example directly into a let without changing the referencing syntax in the example. If you refactor to an instance variable, you have to change how you reference the object in the example (e.g. add an @).
This is a bit subjective, but as Mike Lewis pointed out, I think it makes the spec easier to read. I like the organization of defining all my dependent objects with let and keeping my it block nice and short.
A related link can be found here: http://www.betterspecs.org/#let
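The first two points can be illustrated without RSpec. The following is a plain-Ruby sketch of the idea behind let, not RSpec's actual implementation: TinyExample, UserExample and BUILDS are hypothetical names, and each new object stands in for one example.

```ruby
class TinyExample
  # Hypothetical stand-in for let: defines a lazy, memoized reader method.
  def self.let(name, &block)
    define_method(name) do
      @memo ||= {}
      @memo.fetch(name) { @memo[name] = instance_exec(&block) }
    end
  end
end

BUILDS = [] # records every time the block actually runs

class UserExample < TinyExample
  let(:user) { BUILDS << :built; "alice" }
end

example = UserExample.new
puts BUILDS.size   # prints 0 (nothing is built until the method is called)
example.user
example.user
puts BUILDS.size   # prints 1 (cached within the same "example")
UserExample.new.user
puts BUILDS.size   # prints 2 (a new "example" builds its own value)

begin
  example.usr      # misspelled name
rescue NameError => e
  puts e.class     # prints NoMethodError (a subclass of NameError)
end
```

A misspelled instance variable such as @usr would instead evaluate to nil without raising anything.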
The difference between using instance variables and let() is that let() is lazy-evaluated. This means that let() is not evaluated until the method that it defines is run for the first time.
The difference between before and let is that let() gives you a nice way of defining a group of variables in a 'cascading' style. By doing this, the spec looks a little better by simplifying the code.
I have completely replaced all uses of instance variables in my rspec tests to use let(). I've written a quickie example for a friend who used it to teach a small Rspec class: http://ruby-lambda.blogspot.com/2011/02/agile-rspec-with-let.html
As some of the other answers here say, let() is lazy evaluated so it will only load the ones that require loading. It DRYs up the spec and makes it more readable. I've in fact ported the Rspec let() code to use in my controllers, in the style of inherited_resource gem. http://ruby-lambda.blogspot.com/2010/06/stealing-let-from-rspec.html
Along with lazy evaluation, the other advantage is that, combined with ActiveSupport::Concern, and the load-everything-in spec/support/ behavior, you can create your very own spec mini-DSL specific to your application. I've written ones for testing against Rack and RESTful resources.
The strategy I use is Factory-everything (via Machinist+Forgery/Faker). However, it is possible to use it in combination with before(:each) blocks to preload factories for an entire set of example groups, allowing the specs to run faster: http://makandra.com/notes/770-taking-advantage-of-rspec-s-let-in-before-blocks
It is important to keep in mind that let is lazily evaluated, so avoid putting methods with side effects in it; otherwise you would not be able to change from let to before(:each) easily.
You can use let! instead of let so that it is evaluated before each scenario.
In general, let() is a nicer syntax, and it saves you typing @name symbols all over the place. But, caveat emptor! I have found let() also introduces subtle bugs (or at least head scratching) because the variable doesn't really exist until you try to use it... Telltale sign: if adding a puts after the let() to see that the variable is correct allows a spec to pass, but without the puts the spec fails -- you have found this subtlety.
I have also found that let() doesn't seem to cache in all circumstances! I wrote it up in my blog: http://technicaldebt.com/?p=1242
Maybe it is just me?
Dissenting voice here: after 5 years of rspec I don't like let very much.
1. Lazy evaluation often makes test setup confusing
It becomes difficult to reason about setup when some things that have been declared in setup are not actually affecting state, while others are.
Eventually, out of frustration someone just changes let to let! (same thing without lazy evaluation) in order to get their spec working. If this works out for them, a new habit is born: when a new spec is added to an older suite and it doesn't work, the first thing the writer tries is to add bangs to random let calls.
Pretty soon all the performance benefits are gone.
2. Special syntax is unusual to non-rspec users
I would rather teach Ruby to my team than the tricks of rspec. Instance variables or method calls are useful everywhere in this project and others; let syntax will only be useful in rspec.
3. The "benefits" allow us to easily ignore good design changes
let() is good for expensive dependencies that we don't want to create over and over.
It also pairs well with subject, allowing you to dry up repeated calls to multi-argument methods
Expensive dependencies repeated many times, and methods with big signatures, are both points where we could make the code better:
maybe I can introduce a new abstraction that isolates a dependency from the rest of my code (which would mean fewer tests need it)
maybe the code under test is doing too much
maybe I need to inject smarter objects instead of a long list of primitives
maybe I have a violation of tell-don't-ask
maybe the expensive code can be made faster (rarer - beware of premature optimisation here)
In all these cases, I can address the symptom of difficult tests with a soothing balm of rspec magic, or I can try to address the cause. I feel like I spent way too much of the last few years on the former and now I want some better code.
To answer the original question: I would prefer not to, but I do still use let. I mostly use it to fit in with the style of the rest of the team (it seems like most Rails programmers in the world are now deep into their rspec magic so that is very often). Sometimes I use it when I'm adding a test to some code that I don't have control of, or don't have time to refactor to a better abstraction: i.e. when the only option is the painkiller.
let is functional, as it's essentially a Proc. Also, it's cached.
One gotcha I found right away with let... In a Spec block that is evaluating a change.
let(:object) { FactoryGirl.create :object }
expect {
  post :destroy, id: object.id
}.to change(Object, :count).by(-1)
You'll need to make sure the let is evaluated outside of (before) your expect block, because the let block is where FactoryGirl.create is called. I usually do this by verifying the object is persisted.
object.persisted?.should eq true
Otherwise when the let block is called the first time a change in the database will actually happen due to the lazy instantiation.
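The effect can be reproduced without RSpec or a database. In this plain-Ruby sketch an array stands in for the table and a memoizing lambda stands in for the lazy let; lazy_record and change_during_destroy are hypothetical helpers written for this illustration.

```ruby
# Stand-in for a lazy let: the record is only pushed into the store
# the first time the lambda is called.
def lazy_record(store)
  created = nil
  -> { created ||= store.push(:record).last }
end

# Returns how much store.size changes while the "destroy" step runs.
def change_during_destroy(force_first:)
  store = []
  record = lazy_record(store)
  record.call if force_first   # like touching the object before expect { }
  before = store.size
  store.delete(record.call)    # like the destroy request inside expect { }
  store.size - before
end

puts change_during_destroy(force_first: false) # prints 0
puts change_during_destroy(force_first: true)  # prints -1
```

When the first call happens inside the measured step, the creation and the deletion cancel out, so an expected change of -1 is never observed.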
Update
Just adding a note. Be careful playing code golf or in this case rspec golf with this answer.
In this case, I just have to call some method to which the object responds. So I invoke the persisted? method on the object, as it's truthy. All I'm trying to do is instantiate the object. You could call empty? or nil? too. The point isn't the test but bringing the object to life by calling it.
So you can't refactor
object.persisted?.should eq true
to be
object.should be_persisted
as the object hasn't been instantiated... it's lazy. :)
Update 2
Leverage the let! syntax for instant object creation, which should avoid this issue altogether. Note though that it will defeat a lot of the purpose of the laziness of the non-banged let.
Also in some instances you might actually want to leverage the subject syntax instead of let as it may give you additional options.
subject(:object) {FactoryGirl.create :object}
"before" by default implies before(:each). Ref The Rspec Book, copyright 2010, page 228.
before(scope = :each, options={}, &block)
I use before(:each) to seed some data for each example group without having to call the let method to create the data in the "it" block. Less code in the "it" block in this case.
I use let if I want some data in some examples but not others.
Both before and let are great for DRYing up the "it" blocks.
To avoid any confusion, "let" is not the same as before(:all). "Let" re-evaluates its method and value for each example ("it"), but caches the value across multiple calls in the same example. You can read more about it here: https://www.relishapp.com/rspec/rspec-core/v/2-6/docs/helper-methods/let-and-let
Note to Joseph -- if you are creating database objects in a before(:all) they won't be captured in a transaction and you're much more likely to leave cruft in your test database. Use before(:each) instead.
The other reason to use let and its lazy evaluation is so you can take a complicated object and test individual pieces by overriding lets in contexts, as in this very contrived example:
context "foo" do
  let(:params) do
    { :foo => foo, :bar => "bar" }
  end
  let(:foo) { "foo" }
  it "is set to foo" do
    params[:foo].should eq("foo")
  end
  context "when foo is bar" do
    let(:foo) { "bar" }
    # NOTE we didn't have to redefine params entirely!
    it "is set to bar" do
      params[:foo].should eq("bar")
    end
  end
end
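One way to picture why the override works: let defines a method, and a nested context behaves much like a subclass, so redefining foo changes what params sees. The following plain-Ruby analogy is only a sketch of that idea; FooContext and WhenFooIsBar are hypothetical names and this is not how you would write the spec.

```ruby
class FooContext
  def foo
    "foo"
  end

  # params calls foo at the moment it is evaluated, so it picks up overrides.
  def params
    { foo: foo, bar: "bar" }
  end
end

# Plays the role of the nested "when foo is bar" context.
class WhenFooIsBar < FooContext
  def foo
    "bar"
  end
end

puts FooContext.new.params[:foo]   # prints foo
puts WhenFooIsBar.new.params[:foo] # prints bar
```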
I use let to test my HTTP 404 responses in my API specs using contexts.
To create the resource, I use let!. But to store the resource identifier, I use let. Here is how it looks:
let!(:country) { create(:country) }
let(:country_id) { country.id }
before { get "api/countries/#{country_id}" }
it('responds with HTTP 200') { should respond_with(200) }
context 'when the country does not exist' do
  let(:country_id) { -1 }
  it('responds with HTTP 404') { should respond_with(404) }
end
That keeps the specs clean and readable.

Best way of emulating enum in Ruby? (part two)

I'm new to Ruby so forgive me if this is something obvious..
I've made a class like so
class Element
attr_accessor :type
:type_integer
:type_string
end
(this is really just an example, not actual code)
Well, I've read Enums in Ruby and I'd prefer to go the Symbols route of having something like enumerations in other languages. I have a problem though, how can I keep my global scope clear while implementing this. What I'm wanting to be able to do is something like
e=Element.new
e.type=Element.type_integer
or something pretty simple and straight forward like that.
Symbols don't do anything to the global (or any other) scope (i.e. no variables or constants or anything else gets defined when you use symbols), so I guess the answer is: just use symbols and the global scope will be kept clear.
If you want to use e.type=Element.type_integer, while still using symbols, you could do:
class Element
def self.type_integer
:type_integer
end
end
Although I fail to see the upside vs. just using e.type = :type_integer directly.
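If the goal is to catch typos in the symbol, one possible variation is to keep the allowed symbols in a frozen constant inside the class and validate on assignment. This is a sketch under that assumption; the TYPES constant and the validating setter are not part of the original answer.

```ruby
class Element
  # Allowed values live on the class, so nothing leaks into the global scope.
  TYPES = %i[type_integer type_string].freeze

  attr_reader :type

  def type=(value)
    unless TYPES.include?(value)
      raise ArgumentError, "unknown type: #{value.inspect}"
    end
    @type = value
  end
end

e = Element.new
e.type = :type_integer
puts e.type # prints type_integer

begin
  e.type = :type_intger # misspelled on purpose
rescue ArgumentError => error
  puts error.message # prints unknown type: :type_intger
end
```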
