Is there a difference in usage between
class Helper
class << self
# ...
end
end
and
module Helper
class << self
# ...
end
end
When would you use one over the other?
The class<<self seems to be a red herring, as the only difference here is a class versus a module. Perhaps you're asking "I want to create an object that I do not intend to instantiate, but which exists only as a namespace for some methods (and possibly as a singleton with its own, global, state)."
If this is the case, both will function equally well. If there is any chance that you might want to create a derivative (another object inheriting the same methods) then you should use a class as it slightly is easier to write:
class Variation < Helper
instead of
module Helper
module OwnMethods
# Put methods here instead of class << self
end
extend OwnMethods
end
module Variation
extend Helper::OwnMethods
However, for just namespacing I would generally use a module over a class, as a class implies that instantiation will occur.
The difference between a Module and a Class is that you can make an instance of a Class, but not a Module. If you need to create an instance of Helper (h = Helper.new) then it should be a class. If not, it is probably best to remain a module. I'm not sure how the rest of your code is relevant to the question; whether you have class methods on a Module or a Class is not relevant to whether you need to create instances of that object.
Related
I sometimes write modules that will only contain module methods (as opposed to module instance methods) (are there better names for these?). These modules should not be included in classes because that would have no effect and be misleading to a reader. So I'd like it to be as clear as possible to the reader that these modules contain no instance methods.
If I define all methods with .self, then a reader has to inspect all methods to ensure that this module contains no instance methods. If I instead use class << self or extend self then it is automatic; as soon as the reader sees this, they know.
I think extend self is best becuase with class << self one has to find its corresponding end; that is, it may not apply to all methods in the module.
So is it a good idea, and a best practice, to use extend self in cases like this?
Also, is there any difference at runtime between enclosing all methods in class << self as opposed to using extend self?
I sometimes write modules that will only contain module methods (as opposed to module instance methods) (are there better names for these?).
Singleton, meaning a class with a single instance. Here that "single instance" is the Module instance.
If I define all methods with .self, then a reader has to inspect all methods to ensure that this module contains no instance methods
The module's documentation should make this clear. If the user of a module has to study the code to understand your module, that is a documentation failure.
What does extend self do?
So I'd like it to be as clear as possible to the reader that these modules contain no instance methods.
extend self does the opposite. It makes all the instance methods also be class methods. It's equivalent to YourModule.extend(YourModule).
module YourModule
def some_method
23
end
extend self
end
Is the same as...
module YourModule
def some_method
23
end
end
YourModule.extend(YourModule)
Which is similar to...
module YourModule
def some_method
23
end
def self.some_method
23
end
end
Why would you do this? To allow both...
YourModule.some_method
and also...
class SomeClass
extend YourModule
end
SomeClass.some_method
There are edge cases where you might want this, but for general use I would argue this is an anti-pattern. The first is using a module as a singleton, the second is using the module as a mixin or trait. These are two rather different design goals for a module. Trying to be both will compromise the design of both.
Pros and Cons.
Since the primary use case of being both a singleton and a mixin is an anti-pattern, I would argue use class << self, with def self.method occasionally, and module_function and extend self never.
class << self
Pros
All class definitions are grouped together.
The block scope makes it clear what affects the class and what affects the instances.
Indentation makes it clear what is in the block.
IDEs can clearly identify what is in the block.
It allows using normal declarations like attr_accessor on the class.
It is documented.
It is common.
Rubocop approved.
Cons
When looking at an individual method, it's not as obvious as def self.method.
def self.method
Pros
It's obvious it's a class method from looking at the method.
It is documented.
It is common.
Rubocop approved.
Cons
You might forget to add the self..
It allows mixing of class and instance methods making the reader hunt through the code.
It does not help using attr_accessor and friends on the class.
extend self
Pros
It allows your module to act as both a singleton (YourModule.method) and a mixin (extend YourModule)... which is also a con.
Cons
It is obscure; many (most?) won't know to look for it or what it means if they find it.
It is not documented (or if it is, I can't find it).
Individual methods look like instance methods.
It can appear anywhere in the module, and there's no consensus where it should go, making it action at a distance.
It affects the meaning of code before it, the one case I can think of this in Ruby, further making it action at a distance.
Rubocop prefers module_function to extend self, though doesn't explain why. For my guesses, see below.
It allows your module to act as both a singleton (YourModule.method) and a mixin (extend YourModule). Those are two rather different use cases making this an anti-pattern.
module_function
I've never heard of this either, but it came up when searching for extend self. I would also say to never use this, use class << self, but it's better than extend self.
Pros
It's at least mentioned in the Modules and Classes documentation.
It's documented.
It works like private in that it affects all methods below it (though this is also a con, see below).
If there are to be no instance methods, it must appear at the top of the module.
Rubocop approved.
Cons
It is obscure; many (most?) won't know to look for it or what it means when they find it.
Individual methods look like instance methods.
It affects the meaning of distant code after it making it action at a distance.
I don't see why it should matter how you decide to define the module methods. Consider simply raising an exception if the module is included in another module (which may be a class). You can do that with the callback (a.k.a. "hook") method Module#included. Here's an example.
module M
# This module is not to be included in a class because
# it contains no instance methods.
def self.included(klass)
raise "\nYou intended to include this module in #{klass}. You must be out of\nyour mind! It does no harm but there is no point in doing so\nbecause this module contains no instance methods. Duh!"
end
def self.hi
puts "Hi, guys"
end
end
M.hi
Hi, guys
class C
include M
end
RuntimeError:
You intended to include this module in C. You must be out of
your mind! It does no harm but there is no point in doing so
because this module contains no instance methods. Duh!
I'm reading my ruby book. Looking at the code below,
module Destroy
def destroy(anyObject)
#anyObject = anyObject
puts "I will destroy the object: #{anyObject}"
end
end
class User
include Destroy
attr_accessor :name, :email
def initialize(name,email)
#name = name
#email = email
end
end
my_info = User.new("Bob","Bob#example.com")
puts "So your name is: #{my_info.name} and you have email #{my_info.email}"
user = User.new("john","john#example.com")
user.destroy("blah")
I could've just created another method inside my class. Why would I want to do this? Why would I want to use a module? It's not like embedding this into other classes is any easier than just using normal inheritance.
You can think of a module and the methods and constants inside of it as more of providing utility functions and actions that you can include to other objects as you see fit. For example, if you wanted to use destroy function in the objects Foo and Bar you would do similarly:
class Foo
include Destroy
# other code below
end
class Bar
include Destroy
# other code below
end
Now any Foo or Bar object has access to all the methods or constants inside of destroy.
Modules define a namespace, a sandbox in which your methods and constants can play without having to worry about being stepped on by other methods and constants. The ruby docs goes into more depth about this and includes a good example practical of when you would want to use it as seen below:
module Debug
def whoAmI?
"#{self.type.name} (\##{self.id}): #{self.to_s}"
end
end
class Phonograph
include Debug
# ...
end
class EightTrack
include Debug
# ...
end
ph = Phonograph.new("West End Blues")
et = EightTrack.new("Surrealistic Pillow")
ph.whoAmI? » "Phonograph (#537766170): West End Blues"
et.whoAmI? » "EightTrack (#537765860): Surrealistic Pillow"
In this example, every class that includes Debug has access to the method whoAmI? and other methods and constants that Debug includes without having to redefine it for every class.
Some programming languages such as C++, Perl, and Python allow one class to inherit from multiple other classes; that is called multiple inheritance. Ruby does not support multiple inheritance. That means each class can only inherit from one other class. However, there are cases where a class would benefit by acquiring methods defined within multiple other classes. That is made possible by using a construct called module.
A module is somewhat similar to a class, except it does not support inheritance, nor instantiating. It is mostly used as a container for storing multiple methods. One way to use a module is to employ an include or extend statement within a class. That way, the class gains access to all methods and objects defined within the module. It is said that the module is mixed in the class. So, a mixin is just a module included in a class. A single module can be mixed in multiple classes, and a single class can mix in multiple modules; thus, any limitations imposed by Ruby's single inheritance model are eliminated by the mixin feature.
Modules can also be used for namespacing. That is explained in this post at the Practicing Ruby website.
You are writing a module in the same file as the class, but not necessarily inside the class, but anyway.
For me there are 3 reasons to use a module(more details here):
Modules provide a namespace and prevent name clashes.
Modules implement the mixin facility.
When you have a very complex and dense class, you can split it into
modules and include them into the main class.
Your example is fairly simple, it would indeed make more sense to write the method in the class itself, but try to imagine a complex scenario.
Let's say I have a bunch of related functions that have no persistent state, say various operations in a string differencing package. I can either define them in a class or module (using self) and they can be accessed the exact same way:
class Diff
def self.diff ...
def self.patch ...
end
or
module Diff
def self.diff ...
def self.patch ...
end
I can then do Diff.patch(...). Which is 'better' (or 'correct')?
The main reason I need to group them up is namespace issues, common function names are all used elsewhere.
Edit: Changed example from matrix to diff. Matrix is a terrible example as it does have state and everyone started explaining why it's better to write them as methods rather than answer the actual question. :(
In your two examples, you are not actually defining methods in a Class or a Module; you are defining singleton methods on an object which happens to be a Class or a Module, but could be just about any object. Here's an example with a String:
Diff = "Use me to access really cool methods"
def Diff.patch
# ...
end
You can do any of these and that will work, but the best way to group related methods is in a Module as normal instance methods (i.e. without self.):
module Diff
extend self # This makes the instance methods available to the Diff module itself
def diff ... # no self.
def patch ...
end
Now you can:
use this functionality from within any Class (with include Diff) or from any object (with extend Diff)
an example of this use is the extend self line which makes it possible to call Diff.patch.
even use these methods in the global namespace
For example, in irb:
class Foo
include Diff
end
Foo.new.patch # => calls the patch method
Diff.patch # => also calls Diff.patch
include Diff # => now you can call methods directly:
patch # => also calls the patch method
Note: the extend self will "modify" the Diff module object itself but it won't have any effect on inclusions of the module. Same thing happens for a def self.foo, the foo won't be available to any class including it. In short, only instance methods of Diff are imported with an include (or an extend), not the singleton methods. Only subclassing a class will provide inheritance of both instance and singleton methods.
When you actually want the inclusion of a module to provide both instance methods and singleton methods, it's not completely easy. You have to use the self.included hook:
module Foo
def some_instance_method; end
module ClassMethods
def some_singleton_method; end
end
def self.included(base)
base.send :extend, ClassMethods
end
def self.will_not_be_included_in_any_way; end
end
class Bar
include Foo
end
# Bar has now instance methods:
Bar.new.some_instance_method # => nil
# and singleton methods:
Bar.some_singleton_method # => nil
The main difference between modules and classes is that you can not instantiate a module; you can't do obj = MyModule.new. The assumption of your question is that you don't want to instantiate anything, so I recommend just using a module.
Still you should reconsider your approach: rather than using arrays of arrays or whatever you are doing to represent a Matrix, it would be more elegant to make your own class to represent a matrix, or find a good class that someone else has already written.
Ruby Modules are used to specify behaviour, pieces of related functionality.
Ruby Classes are used to specify both state and behaviour, a singular entity.
There is a maxim in software design that says that code is a liability, so use the less code possible. In the case of Ruby, the difference in code lines is cero. So you can use either way (if you don't need to save state)
If you want to be a purist, then use a Module, since you won't be using the State functionality. But I wouldn't say that using a class is wrong.
As a trivia info: In Ruby a Class is a kind of Module.
http://www.ruby-doc.org/core-1.9.3/Class.html
The following also works
Matrix = Object.new
def Matrix.add ...
def Matrix.equals ...
That's because so-called "class methods" are just methods added to a single object, and it doesn't really matter what that object class is.
As a matter of form, the Module is more correct. You can still create instances of the class, even if it has only class methods. You can think of a module here as a static class of C# or Java. Classes also always have the instance related methods (new, allocate, etc.). Use the Module. Class methods usually have something to do with objects (creating them, manipulating them).
I'm new to ruby metaprogramming, and I see people metaprogramming code in different places, like class Object, class Module, module Kernel and "nothing" (ie, out of a class/module definition block).
E.g.: I'm creating a c_attr_accessor method to access class variables, and I'm not sure where I must put the code, since it works in any of those cases.
How to decide what place is more appropriate to put a new global code?
Each of these examples fall into different cases.
If you are writing methods that apply to all objects, then you open the Object class so all objects can access it. If you are writing methods that apply to all modules, then you open Module. Whenever you open a class to add methods, the methods should apply to all instances of the class and nothing else.
Extending the Kernel module is different: people do this to add methods that should be available to every scope, but not really as methods to be explicitly called on an object, by making them private.
When you are outside of any class or module statement, you are in the scope of the main object, and methods you define default to being private methods of Object. This is fine for small or simple programs, but you will eventually want to use proper modules as namespaces to organize your methods.
As a final note on the subject, you always need to be sure that you really want methods you add to built-in classes and modules to be available to everything in your application, including external inclusions because they all share the built-ins.
Now to apply this to answer your question. Because you are defining a method that creates accessors for class variables, you should put it in the class Class as it applies to all classes and nothing else. Finally, you will likely only use it in class definitions (within a class statement), so we should make it private:
class Class
private
def c_attr_accessor(name)
# ...
end
end
class User
c_attr_accessor :class_variable_name
# ...
end
If you don't really need it in every class (maybe just a few), then create a "mixin module" to extend every class that needs this feature:
module ClassVariableAccessor
private
def c_attr_accessor(name)
# ...
end
end
class User
extend ClassVariableAccessor
c_attr_accessor :class_variable_name
# ...
end
Note that you are using Object#extend to add c_attr_accessor only to the object User (remember that classes are objects; you will hear that a lot if you are new to Ruby metaprogramming).
There is another way to implement the last example, which works by explicitly extending its base class through the Module#included(base_class) "hook method" called whenever the module included, and the base class is passed to base_class:
module ClassVariableAccessor
def included(base_class)
base_class.extend ClassMethods
end
module ClassMethods
def c_attr_accessor(name)
# ...
end
end
end
class User
include ClassVariableAccessor
c_attr_accessor :class_variable_name
# ...
end
I recommend this last solution because it is the most general and uses a simple interface that should not need to be updated. I hope this is not too much!
Have you tried looking up where the normal attribute accessors are defined? I'd either define it in the same class/module, or create my own module in which all my new methods go.
I am currently working through the Gregory Brown Ruby Best Practices book. Early on, he is talking about refactoring some functionality from helper methods on a related class, to some methods on module, then had the module extend self.
Hadn't seen that before, after a quick google, found out that extend self on a module lets methods defined on the module see each other, which makes sense.
Now, my question is when would you do something like this
module StyleParser
extend self
def process(text)
...
end
def style_tag?(text)
...
end
end
and then refer to it in tests with
#parser = Prawn::Document::Text::StyleParser
as opposed to something like this?
class StyleParser
def self.process(text)
...
end
def self.style_tag?(text)
...
end
end
is it so that you can use it as a mixin? or are there other reasons I'm not seeing?
A class should be used for functionality that will require instantiation or that needs to keep track of state. A module can be used either as a way to mix functionality into multiple classes, or as a way to provide one-off features that don't need to be instantiated or to keep track of state. A class method could also be used for the latter.
With that in mind, I think the distinction lies in whether or not you really need a class. A class method seems more appropriate when you have an existing class that needs some singleton functionality. If what you're making consists only of singleton methods, it makes more sense to implement it as a module and access it through the module directly.
In this particular case I would probably user neither a class nor a module.
A class is a factory for objects (note the plural). If you don't want to create multiple instances of the class, there is no need for it to exist.
A module is a container for methods, shared among multiple objects. If you don't mix in the module into multiple objects, there is no need for it to exist.
In this case, it looks like you just want an object. So use one:
def (StyleParser = Object.new).process(text)
...
end
def StyleParser.style_tag?(text)
...
end
Or alternatively:
class << (StyleParser = Object.new)
def process(text)
...
end
def style_tag?(text)
...
end
end
But as #Azeem already wrote: for a proper decision, you need more context. I am not familiar enough with the internals of Prawn to know why Gregory made that particular decision.
If it's something you want to instantiate, use a class. The rest of your question needs more context to make sense.