I am just starting Ruby and learning the concept of modules. I understand that one use of modules is to better organize your code and avoid name clashes. Let's say I have bunch of modules like this (I haven't included the implementation as that's not important)
:
module Dropbox
class Base
def initialize(a_user)
end
end
class Event < Base
def newFile?
end
def newImage?
end
end
class Action < Base
def saveFile(params)
end
end
end
and another module:
module CustomURL
class Base
def initialize(a_user, a_url, a_method, some_args, a_regex)
end
end
class Event < Base
def initialize(a_user, a_url, a_method, some_args, a_regex)
end
def change?
end
end
class Action < Base
def send_request(params)
end
end
end
I am going to have a bunch of these modules (10+, for gmail, hotmail, etc...). What I am trying to figure out is, is this the right way to organize my code?
Basically, I am using module to represent a "service" and all services will have a common interface class (base for initializing, action for list of actions and event for monitoring).
You are defining families of related or dependent classes here. Your usage of modules as namespaces for these families is correct.
Also with this approach it would be easy to build abstract factory for your classes if they had compatible interface. But as far as I see this is not the case for current classes design: for example Dropbox::Event and CustomURL::Event have completely different public methods.
You can reevaluate design of your classes and see if it is possible for them to have uniform interface so that you can use polymorphism and extract something like BaseEvent and BaseAction so that all Events and Actions will derive from these base classes.
Update: As far as you define services, it might be useful to define top-level module like Service and put all your classes inside this module. It will improve modularity of your system. If in the future you would refactor out some base classes for your modules services, you can put them in the top-level namespace. Then your objects will have readable names like these:
Service::Dropbox::Event
Service::Dropbox::Action
Service::CustomURL::Event
Service::CustomURL::Action
Service::BaseEvent
Service::BaseAction
I have some similar code at work, only I'm modeling networking gear.
I took the approach of defining a generic class with the common attributes and methods, including a generic comparator, and then sub-class that for the various models of hardware. The sub-classes contain the unique attributes for that hardware, plus all the support code necessary to initialize or compare an instance of that equipment with another.
As soon as I see the need to write a method similar to another I wrote I think about how I can reuse that code by promoting it to the base-class. Often this involves changing how I am passing parameters, and instead of using formal parameters, I end up using a hash, then pulling what I need from it, keeping the method interface under control.
Because you would have a lot of sub-classes to a base class, it's important to take your time and think out how that base-class should work. As you add sub-classes the task of refactoring the base will get harder because you will have to change other sub-classes. I always find I go down some blind-alleys and have to back up a bit, but as the class matures that should happen less and less.
As you will notice soon, there is no 'right way' of organizing code.
There are subtle differences in readability that are mostly subjective. The way you are organizing classes is just fine for releasing your code as a gem. It usually isn't needed in code that won't be included in other peoples projects, but it won't hurt either.
Just ask yourself "does this make sense for someone reading my code who has no idea what my intention is?".
Related
Is it conventional to define a subclass under it's parent class like below?
class Element
class Div < Element
end
class Paragraph < Element
end
end
Or is it more appropriate to make a module to contain the subclasses?
class Element
end
module Elements
class Div < Element
end
class Paragraph < Element
end
end
Or to create a "base" class in a module and define the subclasses within the same module?
module Element
class Base
end
class Div < Base
end
class Paragraph < Base
end
end
Or is it better to force a naming convention?
class Element
end
class DivElement < Element
end
class ParagraphElement < Element
end
It seems every library chooses a different namespacing/naming convention.
Which is the best to use?
What are the pros and cons of each?
TL;DR: The most conventional and best way to do this is with a module containing a base class and its subclasses. But let's go over everything.
There aren't many official sources on this; however, it is the style to use modules to contain libraries, code, and groups of classes.
Subclasses in Superclass
Pros:
Self-contained: The entire class system is contained under one name
Cons:
Unnatural: Who would think to look inside a superclass for a subclass?
Having the subclasses in a superclass really depends on situation. However, any benefits of this method are also achieved by the module method. But in practice, this just isn't done. Classes are here to contain methods, instance variables, class methods, etc. But classes can be thought of as a last level of nesting -- you don't have a class in a class unless it's a very specific circumstance.
The one case where I can think of this making sense is a case where the only way subclasses are used is through the superclass, for example a Formatter class that has internal subclassses like XML, PDF, etc. Let's say that you only use these classes by doing things like Formatter.new(:xml). But if we're doing this, the subclasses should be private and not accessible to the outside world anyway. And at that point, inheritance is a very C++y way and not Rubyish at all.
Base class outside module, subclasses within
Pros:
I can't think of any
Cons:
Implies Non-conected: If Element is not in the same namespace as its children, what, beyond the name tells me that it's even related?
This method is very unnatural. It makes it look as if Element has nothing to do with it's children, or if looked at differently, that it's children are internal implementation details that aren't to be dealt with. Either way it looks like shabby, sloppy naming and bad code structure planning. If I were reading code using this, I'd have to look at the contents of the Elements module to see that Element was subclassed at all -- and this isn't the most natural thing to do.
Class and subclasses in module (best solution)
Pros:
Contained: The superclass and all the Element classes are contained in one namespace, allowing them to be easily imported, required, iterated, etc. Also assists metaprogramming.
Includable: The classes can be easily includeed into any code.
Clear: There is an obvious association between Element and the subclasses. They are obviously a bundle of functionality.
Cons:
Encourages lazy naming: This does encourage you to name classes things like Base that are very ambiguous.
This is the best approach. It makes the classes a neat bundle, while still showing an obvious association and an obvious "Here, use my Div class" (as opposed to the subclasses-in-class strategy). Also, this is really helpful for metaprogramming, where having everything in a module is crucial to make things work. Finally, this works well with constructs like autoload, require_relative, include, etc. Those show that this is the way the language was designed to be used.
Force a naming convention
Pros:
Simple: No complexity here.
Removes ambiguity: Removes ambiguity from short names like Div or Para by turning them into DivElement and ParaElement.
Cons:
Archaic: Naming conventions to group classes or methods should only exist in languages that don't have a better way to do it, like C or Objective-C. C++ dropped it as soon as it got namespaces.
No programatic grouping: These naming conventions, while clear to humans, make the class structure very cloudy to metaprograming code, and make it impossible for the program to deal with the classes as a group
Pollutes global namespace: This creates many, many names in the global namespace, which is always a bad idea.
This is a very, VERY bad solution. It encorages writing sloppy, C-style code with little organization and little sense. These kinds of naming conventions should only be used in languages where there is no better solution, and Ruby has plenty of better solutions. Even defining all the classes in an array is better than a naming convention.
Note: However, if you really want to, you can define a naming convention on short names like Div or Para as long as you still keep them in a module, so that it's Elements::DivElement. However, this violates DRY, and I wouldn't suggest it.
Conclusion
So, you really have two options. Just put everything in a module:
module Elements
class Element; end
class Div < Element; end
#etc...
end
Or, put everything in a module with a naming convention:
module Elements
class Element; end
class DivElement < Element; end
#etc...
end
I sugest the former for clarity, use of standard methods, and metaprogramming reasons.
This is a problem I have faced many times - you have some shared functionality and a few different implementation classes that use it - it's natural for the shared functionality and the namespace for the implementation classes to have the same name.
I have never had a problem defining a subclass under its parent class as in your first example - except for the fact that it has sometimes confused other developers on my team. So depending on who you are working with and what their preferences are I don't think there's any problem with this approach.
I would still prefer the second approach, with a different name for the namespace, if there was a name that made sense. If there isn't, I'd use the first approach.
I would avoid using names like Base, as in my opinion this sort of generic name encourages you to just throw any old stuff in there, whereas a name that describes what the class does will (hopefully) make you think each time you add a method if it really belongs there.
And I'm really not a fan of compound names as in your naming convention example, which I justify to myself with a sort of vague feeling that it's like database normalisation - each field (or class name) should contain only one piece of information.
Anyway, I hope this helps you. I can't really draw from any sources other than my own experience (it's for you to decide if you think that's 'credible'), as I do think there's no absolute answer to this question. It is in a large part subjective.
Name space and inheritance are for different purposes. Use name space to encapsule a module within another. Use inheritance to define a module using the methods/variables/constants of another as a default.
It could happen in principle that you may want a module to both be within the name space of and inherit from the same single module, but that depends on your use case. Without discussing over a particular use case, it cannot be decided which way among the ones you presented is the best.
I decided to dig into the ActiveRecord code for Rails to try and figure out how some of it works, and was surprised to see it comprised of many modules, which all seem to get included into ActiveRecord::Base.
I know Ruby Modules provide a means of grouping related and reusable methods that can be mixed into other classes to extend their functionality.
However, most of the ActiveRecord modules seem highly specific to ActiveRecord. There seems to be references to instance variables in some modules, suggesting the modules are aware of the internals of the overall ActiveRecord class and other modules.
This got me wondering about how ActiveRecord is designed and how this logic could or should be applied to other Ruby applications.
It is a common 'design pattern' to split large classes into modules that are not really reusable elsewhere simply to split up the class file? Is it seen as good or bad design when modules make use of instance variables that are perhaps defined by a different module or part of the class?
In cases where a class can have many methods and it would become cumbersome to have them all defined in one file, would it make as much sense to simply reopen the class in other files and define more methods in there?
In a command line application I am working on, I have a few classes that do various functions, but I have a top level class that provides an API for the overall application - what I found is that class is becoming bogged down with a lot of methods that really hand off work to other class, and is like the glues that holds the pieces of the application together. I guess I am wondering if it would make sense for me to split out some of the related methods into modules, or re-open the class in different code files? Or is there something else I am not thinking of?
I've created quite a few modules that I didn't intend to be highly reusable. It makes it easier to test a group of related methods in isolation, and classes are more readable if they're only a few hundred lines long rather than thousands. As always, there's a balance to be struck.
I've also created modules that expect the including class to define instance methods so that methods defined on the module can use them. I wouldn't say it's terribly elegant, but it's feasible if you're only sharing code between a couple of classes and you document it well. You could also raise an exception if the class doesn't define the methods you want:
module Aggregator
def aggregate
unless respond_to?(:collection)
raise Exception.new("Classes including #{self} must define #collection")
end
# ...
end
end
I'd be a lot more hesitant to depend on shared instance variables.
The biggest problem I see with re-opening classes is simply managing your source code. Would you end up with multiple copies of aggregator.rb in different directories? Is the load order of those files determinate, and does that affect overriding or calling methods in the class? At least with modules, the include order is explicit in the source.
Update: In a comment, Stephen asked about testing a module that's meant to be included in a class.
RSpec offers shared_examples as a convenient way to test shared behavior. You can define the module's behaviors in a shared context, and then declare that each of the including classes should also exhibit that behavior:
# spec/shared_examples/aggregator_examples.rb
shared_examples_for 'Aggregator' do
describe 'when aggregating records' do
it 'should accumulate values' do
# ...
end
end
end
# spec/models/model_spec.rb
describe Model
it_should_behave_like 'Aggregator'
end
Even if you aren't using RSpec, you can still create a simple stub class that includes your module and then write the tests against instances of that class:
# test/unit/aggregator_test.rb
class AggregatorTester
attr_accessor :collection
def initialize(collection)
self.collection = collection
end
include Aggregator
end
def test_aggregation
assert_equal 6, AggregatorTester.new([1, 2, 3]).aggregate
end
so I have an ruby object that i need to create as a pdf and excel row and cvs row
so far I've created a new class with a method to take in the object and do the necessary stuff to produce the pdf , excel , csv
I've been reading Agile Software Development, Principles, Patterns, and Practices and it mentioned the extension method so i was going to do but since this is ruby should i just be reopening the class in the another file and added the methods on there to separate them from the main class
so
file ruby_model.rb
class RubyModel < ActiveRecord::Base
end
then do
ruby_model_pdf.rb
class RubyModel
def to_pdf
end
end
ruby_model_cvs.rb
class RubyModel
def to_csv
end
end
or should i go with with the object extension pattern?
Cheers
You should put your methods in a module and include the module in the class. This way is preferable because it's easier to see where the methods came from (in a backtrace, for example), and it's easier to reuse the methods if it turns out that they can be used in other classes too.
For example:
module Conversions
def to_pdf
end
def to_csv
end
end
class RubyModel
include Conversions
end
It might also be a good idea to put to_pdf and to_csv in different modules, unless it's the case that if you want to mix in one you always want to mix in the other.
This all assumes that the methods don't belong in the class itself, but judging from the names they don't.
If the language feature works fine, then keep it simple and use it.
Design patterns are documented workarounds for cases where the language is not expressive enough. A Ruby example would be Iterator, which is made redundant by blocks and Enumerable.
I am currently working through the Gregory Brown Ruby Best Practices book. Early on, he is talking about refactoring some functionality from helper methods on a related class, to some methods on module, then had the module extend self.
Hadn't seen that before, after a quick google, found out that extend self on a module lets methods defined on the module see each other, which makes sense.
Now, my question is when would you do something like this
module StyleParser
extend self
def process(text)
...
end
def style_tag?(text)
...
end
end
and then refer to it in tests with
#parser = Prawn::Document::Text::StyleParser
as opposed to something like this?
class StyleParser
def self.process(text)
...
end
def self.style_tag?(text)
...
end
end
is it so that you can use it as a mixin? or are there other reasons I'm not seeing?
A class should be used for functionality that will require instantiation or that needs to keep track of state. A module can be used either as a way to mix functionality into multiple classes, or as a way to provide one-off features that don't need to be instantiated or to keep track of state. A class method could also be used for the latter.
With that in mind, I think the distinction lies in whether or not you really need a class. A class method seems more appropriate when you have an existing class that needs some singleton functionality. If what you're making consists only of singleton methods, it makes more sense to implement it as a module and access it through the module directly.
In this particular case I would probably user neither a class nor a module.
A class is a factory for objects (note the plural). If you don't want to create multiple instances of the class, there is no need for it to exist.
A module is a container for methods, shared among multiple objects. If you don't mix in the module into multiple objects, there is no need for it to exist.
In this case, it looks like you just want an object. So use one:
def (StyleParser = Object.new).process(text)
...
end
def StyleParser.style_tag?(text)
...
end
Or alternatively:
class << (StyleParser = Object.new)
def process(text)
...
end
def style_tag?(text)
...
end
end
But as #Azeem already wrote: for a proper decision, you need more context. I am not familiar enough with the internals of Prawn to know why Gregory made that particular decision.
If it's something you want to instantiate, use a class. The rest of your question needs more context to make sense.
I was reading up on Ruby, and learned about its mixins pattern, but couldn't think of many useful mixin functionality (because I'm not used to thinking that way most likely). So I was wondering what would be good examples of useful Mixin functionality?
Thanks
Edit: A bit of background. I'm Coming from C++, and other Object languages, but my doubt here is that Ruby says it's not inheriting mixins, but I keep seeing mixins as Multiple inheritance, so I fear I'm trying to categorize them too soon into my comfort zone, and not really grok what a mixin is.
They are usually used to add some form of standard functionality to a class, without having to redefine it all. You can probably think of them a bit like interfaces in Java, but instead of just defining a list of methods that need to be implemented, many of them will actually be implemented by including the module.
There are a few examples in the standard library:
Singleton - A module that can be mixed into any class to make it a singleton. The initialize method is made private, and an instance method added, which ensures that there is only ever one instance of that class in your application.
Comparable - If you include this module in a class, defining the <=> method, which compares the current instance with another object and says which is greater, is enough to provide <, <=, ==, >=, >, and between? methods.
Enumerable - By mixing in this module, and defining an each method, you get support for all the other related methods such as collect, inject, select, and reject. If it's also got the <=> method, then it will also support sort, min, and max.
DataMapper is also an interesting example of what can be done with a simple include statement, taking a standard class, and adding the ability to persist it to a data store.
Well the usual example I think is Persistence
module Persistence
def load sFileName
puts "load code to read #{sFileName} contents into my_data"
end
def save sFileName
puts "Uber code to persist #{#my_data} to #{sFileName}"
end
end
class BrandNewClass
include Persistence
attr :my_data
def data=(someData)
#my_data = someData
end
end
b = BrandNewClass.new
b.data = "My pwd"
b.save "MyFile.secret"
b.load "MyFile.secret"
Imagine the module is written by a Ruby ninja, which persists the state of your class to a file.
Now suppose I write a brand new class, I can reuse the functionality of persistence by mixing it in by saying include ModuleILike. You can even include modules at runtime. I get load and save methods for free by just mixing it in. These methods are just like the ones that you wrote yourself for your class. Code/Behavior/Functionality-reuse without inheritance!
So what you're doing is including methods to the method table for your class (not literally correct but close).
In ruby, the reason that Mixins aren't multiple-inheritance is that combining mixin methods is a one time thing. This wouldn't be such a big issue, except that Ruby's modules and classes are open to modification. This means that if you mixin a module to your class, then add a method to the module, the method will not be available to your class; where if you did it in the opposite order, it would.
It's like ordering an ice-cream cone. If you get chocolate sprinkles and toffee bits as your mixins, and walk away with your cone, what kind of ice cream cone you have won't change if someone adds multicolored sprinkles to the chocolate sprinkles bin back at the ice-cream shop. Your class, the ice cream cone, isn't modified when the mixin module, the bin of sprinkles is. The next person to use that mixin module will see the changes.
When you include a module in ruby, it calls Module#append_features on that module, which add a copy of that module's methods to the includer one time.
Multiple inheritance, as I understand it, is more like delegation. If your class doesn't know how to do something, it asks its parents. In an open-class environment, a class's parents may have been modified after the class was created.
It's like a RL parent-child relationship. Your mother might have learned how to juggle after you were born, but if someone asks you to juggle and you ask her to either: show you how (copy it when you need it) or do it for you (pure delegation), then she'll be able at that point, even though you were created before her ability to juggle was.
It's possible that you could modify a ruby module 'include' to act more like multiple inheritance by modifying Module#append_features to keep a list of includers, and then to update them using the method_added callback, but this would be a big shift from standard Ruby, and could cause major issues when working with others code. You might be better creating a Module#inherit method that called include and handled delegation as well.
As for a real world example, Enumerable is awesome. If you define #each and include Enumerable in your class, then that gives you access to a whole host of iterators, without you having to code each and every one.
It is largely used as one might use multiple inheritance in C++ or implementing interfaces in Java/C#. I'm not sure where your experience lies, but if you have done those things before, mixins are how you would do them in Ruby. It's a systemized way of injecting functionality into classes.