It seems a lot of libraries/plugins use this syntax:
def self.included(base) # :nodoc:
base.extend ClassMethods
end
Why is the :nodoc: part necessary?
It is not necessary. If applied to a class, it just suppresses documentation (rdoc) for all the methods in the Class extension. Described in Programming Ruby as:
:nodoc: -
Don't include this element in
the documentation. For classes and
modules, the methods, aliases,
constants, and attributes directly
within the affected class or module
will also be omitted from the
documentation. By default, though,
modules and classes within that class
or module will be documented.
I don't think it's necessary. Actually, in my opinion, it's one of the most useless features of RDoc.
So many times I've seen it while reading a libarie's code and I had to ask myself "Why?". I don't see any reason to use this feature. If you don't want people to use your method, just make it private. It's a big hassle when reading documentation and seeing a method call to a method that's left out of the documentation.
Related
I sometimes write modules that will only contain module methods (as opposed to module instance methods) (are there better names for these?). These modules should not be included in classes because that would have no effect and be misleading to a reader. So I'd like it to be as clear as possible to the reader that these modules contain no instance methods.
If I define all methods with .self, then a reader has to inspect all methods to ensure that this module contains no instance methods. If I instead use class << self or extend self then it is automatic; as soon as the reader sees this, they know.
I think extend self is best becuase with class << self one has to find its corresponding end; that is, it may not apply to all methods in the module.
So is it a good idea, and a best practice, to use extend self in cases like this?
Also, is there any difference at runtime between enclosing all methods in class << self as opposed to using extend self?
I sometimes write modules that will only contain module methods (as opposed to module instance methods) (are there better names for these?).
Singleton, meaning a class with a single instance. Here that "single instance" is the Module instance.
If I define all methods with .self, then a reader has to inspect all methods to ensure that this module contains no instance methods
The module's documentation should make this clear. If the user of a module has to study the code to understand your module, that is a documentation failure.
What does extend self do?
So I'd like it to be as clear as possible to the reader that these modules contain no instance methods.
extend self does the opposite. It makes all the instance methods also be class methods. It's equivalent to YourModule.extend(YourModule).
module YourModule
def some_method
23
end
extend self
end
Is the same as...
module YourModule
def some_method
23
end
end
YourModule.extend(YourModule)
Which is similar to...
module YourModule
def some_method
23
end
def self.some_method
23
end
end
Why would you do this? To allow both...
YourModule.some_method
and also...
class SomeClass
extend YourModule
end
SomeClass.some_method
There are edge cases where you might want this, but for general use I would argue this is an anti-pattern. The first is using a module as a singleton, the second is using the module as a mixin or trait. These are two rather different design goals for a module. Trying to be both will compromise the design of both.
Pros and Cons.
Since the primary use case of being both a singleton and a mixin is an anti-pattern, I would argue use class << self, with def self.method occasionally, and module_function and extend self never.
class << self
Pros
All class definitions are grouped together.
The block scope makes it clear what affects the class and what affects the instances.
Indentation makes it clear what is in the block.
IDEs can clearly identify what is in the block.
It allows using normal declarations like attr_accessor on the class.
It is documented.
It is common.
Rubocop approved.
Cons
When looking at an individual method, it's not as obvious as def self.method.
def self.method
Pros
It's obvious it's a class method from looking at the method.
It is documented.
It is common.
Rubocop approved.
Cons
You might forget to add the self..
It allows mixing of class and instance methods making the reader hunt through the code.
It does not help using attr_accessor and friends on the class.
extend self
Pros
It allows your module to act as both a singleton (YourModule.method) and a mixin (extend YourModule)... which is also a con.
Cons
It is obscure; many (most?) won't know to look for it or what it means if they find it.
It is not documented (or if it is, I can't find it).
Individual methods look like instance methods.
It can appear anywhere in the module, and there's no consensus where it should go, making it action at a distance.
It affects the meaning of code before it, the one case I can think of this in Ruby, further making it action at a distance.
Rubocop prefers module_function to extend self, though doesn't explain why. For my guesses, see below.
It allows your module to act as both a singleton (YourModule.method) and a mixin (extend YourModule). Those are two rather different use cases making this an anti-pattern.
module_function
I've never heard of this either, but it came up when searching for extend self. I would also say to never use this, use class << self, but it's better than extend self.
Pros
It's at least mentioned in the Modules and Classes documentation.
It's documented.
It works like private in that it affects all methods below it (though this is also a con, see below).
If there are to be no instance methods, it must appear at the top of the module.
Rubocop approved.
Cons
It is obscure; many (most?) won't know to look for it or what it means when they find it.
Individual methods look like instance methods.
It affects the meaning of distant code after it making it action at a distance.
I don't see why it should matter how you decide to define the module methods. Consider simply raising an exception if the module is included in another module (which may be a class). You can do that with the callback (a.k.a. "hook") method Module#included. Here's an example.
module M
# This module is not to be included in a class because
# it contains no instance methods.
def self.included(klass)
raise "\nYou intended to include this module in #{klass}. You must be out of\nyour mind! It does no harm but there is no point in doing so\nbecause this module contains no instance methods. Duh!"
end
def self.hi
puts "Hi, guys"
end
end
M.hi
Hi, guys
class C
include M
end
RuntimeError:
You intended to include this module in C. You must be out of
your mind! It does no harm but there is no point in doing so
because this module contains no instance methods. Duh!
I read a blog post that recommends namespacing your monkey patches so they can be easily viewed and included.
For example:
module CoreExtensions
module DateTime
module BusinessDays
def weekday?
!sunday? && !saturday?
end
end
end
end
Would go in the: lib/core_extensions/class_name/group.rb file.
It can be included in the DateTime class with the Module#include instance method (which a class inherits because a Class is a Module)
# Actually monkey-patch DateTime
DateTime.include CoreExtensions::DateTime::BusinessDays
My question is where do the include statements go? Is there a convention?
For example:
I have the following monkey patches:
# http://www.justinweiss.com/articles/3-ways-to-monkey-patch-without-making-a-mess/
module CoreExtensions
module String
module Cases
def snakecase
return self if self !~ /[A-Z]+.*/
# http://rubular.com/r/afGWPWLRBB
underscored = gsub(/(.)([A-Z])/, '\1_\2')
underscored.downcase
end
def camelcase
return self if self !~ /_/ && self =~ /[A-Z]+.*/
split('_').map{ |e| e.capitalize }.join
end
end
end
end
That live inside the lib/core_extensions/string/cases.rb file.
Where should I put my String.include CoreExtensions::String::Cases statement?
Also to be clear this is just a ruby project, does that make a difference?
I've tried putting it inside lib/devify.rb
require 'devify/version'
require 'devify/some_dir'
require 'devify/scaffold'
require 'devify/tree_cloner'
require 'devify/renderer'
require 'devify/project'
require 'devify/tasks/some_task'
require 'devify/tasks/bootstrap'
require 'core_extensions/string/cases'
module Devify
String.include CoreExtensions::String::Cases
end
This works and it makes sense why it works. It's because my entire app lives inside the Devify module or namespace.
This way is also good because I'm not polluting the global namespace correct? Because I'm only monkey patching Strings that live inside Devify?
Just not sure not if this is the right way to go about it.
It doesn't matter where you put the include call.
Calling String.include will always monkey patch the one String class that is used by all the strings in the entire object space. So best put the instruction at the top level as to not mislead readers of the code.
Monkey patching is always global.
It is a powerful feature and can be used for good.
If you are authoring a gem be aware that you're sharing a global namespace with others. The same is also true for top-level modules and even the gem name though. Shared namespaces are just a reality of shared code.
If you are looking for lexically scoped monkey patches look into the new refinement feature that was introduce with Ruby 2.
Refinements are an idea taken from Smalltalk's class boxes. Refinements are not without their own issues though, for example they lack support for introspection and reflection. Thus essentially making them stealth and unfit for production use.
If you are looking to limit the monkey patches to some string object only, consider either subclassing String or calling extend on an instance.
Although ruby offers many ways of changing the content of a class or a method dynamically, the monkey patching can lead to big problems and strange bugs. I read this post (http://www.virtuouscode.com/2008/02/23/why-monkeypatching-is-destroying-ruby/) about why it´s a bad idea to use monkey-patching.
In summary, many things that he says make sense. When you create a monkey-patching, you are assuming that it will only works at that project, and, maybe you can create collisions and unprevisible side-effects when more libraries with similar purposes are put together.
There are cases where the benefits of the monkey-patching were awesome, like ActiveSupport way of dealing with dates manipulation by monkey-patching the Fixnum class with the ago or from_now methods, or as the method to_json. However, monkey patching should be avoided.
The point is: Ruby is an object-oriented language, and you can achieve your objectives using object composition, or any other patterns. Monkey-patching, at some way, leads you in the opposite direction of object oriented philosophy, since you add more responsibilities to an pre-existent class and increases it's public interface to serve a new funcionallity.
Besides, it's not explicit the behavior of that class and the public methods available. You cannot know, by looking at the class definition, what it makes and what is it's role at the system, and how it interact with other objects. It makes a simple task much more harder at the end.
Monkey patching makes everything much more smaller and simpler, apparently, but avoiding it makes your code much more maintanable, easier to debug, read and test, and much more elegant, since it is compliant to the "OOP" patterns.
I'm working on an internal Ruby DSL and to make it look as pretty as possible I need to monkey patch the Symbol class and add some operators. I want to be responsible in how I do this and would like to limit the scope and lifetime of the patches to a specific block of code. Is there a standard pattern for doing this? Here's some pseudo-code to show what I'm thinking:
class SomeContext
def self.monkey_patch_region(&block)
context = SomeContext.new
context.monkey_patch_Symbol
context.instance_eval(&block)
context.unmonkey_patch_Symbol
end
# magical method
def monkey_patch_Symbol
#...
end
# another magical method
def unmonkey_patch_Symbol
#...
end
end
I believe, that you're looking for ruby refinements. The feature has landed in ruby trunk, but it might be reverted before 2.0
I've heard about mixology gem. It was designed to mixin and unmix modules. Maybe it can be useful to monkey and unmonkey patches.
UPDATE: mixology won't help you, as it (un)mixes modules to objects (as with extend), not to classes (as with include), and you want monkey/unmonkey core classes, not their objects individually. Anyway I intend to maintain this answer as possibly useful reference for someone else.
In ruby, I understand that module functions can be made available without mixing in the module by using module_function as shown here. I can see how this is useful so you can use the function without mixing in the module.
module MyModule
def do_something
puts "hello world"
end
module_function :do_something
end
My question is though why you might want to have the function defined both of these ways.
Why not just have
def MyModule.do_something
OR
def do_something
In what kind of cases would it be useful to have the function available to be mixed in, or to be used as a static method?
Think of Enumerable.
This is the perfect example of when you need to include it in a module. If your class defines #each, you get a lot of goodness just by including a module (#map, #select, etc.). This is the only case when I use modules as mixins - when the module provides functionality in terms of a few methods, defined in the class you include the module it. I can argue that this should be the only case in general.
As for defining "static" methods, a better approach would be:
module MyModule
def self.do_something
end
end
You don't really need to call #module_function. I think it is just weird legacy stuff.
You can even do this:
module MyModule
extend self
def do_something
end
end
...but it won't work well if you also want to include the module somewhere. I suggest avoiding it until you learn the subtleties of the Ruby metaprogramming.
Finally, if you just do:
def do_something
end
...it will not end up as a global function, but as a private method on Object (there are no functions in Ruby, just methods). There are two downsides. First, you don't have namespacing - if you define another function with the same name, it's the one that gets evaluated later that you get. Second, if you have functionality implemented in terms of #method_missing, having a private method in Object will shadow it. And finally, monkey patching Object is just evil business :)
EDIT:
module_function can be used in a way similar to private:
module Something
def foo
puts 'foo'
end
module_function
def bar
puts 'bar'
end
end
That way, you can call Something.bar, but not not Something.foo. If you define any other methods after this call to module_function, they would also be available without mixing in.
I don't like it for two reasons, though. First, modules that are both mixed in and have "static" methods sound a bit dodgy. There might be valid cases, but it won't be that often. As I said, I prefer either to use a module as a namespace or mix it in, but not both.
Second, in this example, bar would also be available to classes/modules that mix in Something. I'm not sure when this is desirable, since either the method uses self and it has to be mixed in, or doesn't and then it does not need to be mixed in.
I think using module_function without passing the name of the method is used quite more often than with. Same goes for private and protected.
It's a good way for a Ruby library to offer functionality that does not use (much) internal state. So if you (e.g.) want to offer a sin function and don't want to pollute the "global" (Object) namespace, you can define it as class method under a constant (Math).
However, an app developer, who wants to write a mathematical application, might need sin every two lines. If the method is also an instance method, she can just include the Math (or My::Awesome::Nested::Library) module and can now directly call sin (stdlib example).
It's really about making a library more comfortable for its users. They can choose themself, if they want the functionality of your library on the top level.
By the way, you can achieve a similar functionality like module_function by using: extend self (in the first line of the module). To my mind, it looks better and makes things a bit clearer to understand.
Update: More background info in this blog article.
If you want to look at a working example, check out the chronic gem:
https://github.com/mojombo/chronic/blob/master/lib/chronic/handlers.rb
and Handlers is being included in the Parser class here:
https://github.com/mojombo/chronic/blob/master/lib/chronic/parser.rb
He's using module_function to send the methods from Handlers to specific instances of Handler using that instance's invoke method.
I'm writing some code in which I'd like to add some methods to a predefined class, like so:
class Model # this class already exists
def my_method
# code here
end
end
Is there any way to namespace this, using Modules or otherwise?
There will be a mechanism to do this in Ruby 2.0, although it is not exactly clear what exactly that mechanism is going to be. For the past almost 10 years, the frontrunner seemed to be Selector Namespaces, but recently Classboxes and even more recently, Refinements have taken the lead. In fact, if I am not mistaken, Refinements are actually currently implemented in the YARV trunk.
With all currently existing versions (including the soon to be released 1.9.3), however, there is no way to do this.
That's one of the reasons why monkey patching should generally be avoided.
Just write your methods inside a Module and include it in Model:
module SomeModule
def my_method
end
end
class Model
include SomeModule
end