I'm writing an interactive shell (like irb) in Ruby for another language. I thought of putting all the shell-related functions (i.e., reading and parsing input) in a module, then doing class Object; include Shell; end.
My doubt is this: from reading Ruby Under a Microscope, I learned that in CRuby (which I'm using), when you include a module, Ruby creates a copy of that module and inserts it as the superclass of the class that includes it. That seems like it would waste a bit of memory, but then I looked at the Kernel module, which AFAIK is included in Object, and that seems to work fine.
Finally, the question: would it be good practice to include Shell in Object?
Running class Object; include Shell; end
is the same as just writing include Shell at the top level, since Object is the default scope.
This is not bad for memory; including a module is pretty fundamental behavior.
If you want to avoid working in the Object (global) scope, you can define a custom class that mixes in Shell:
class App
  extend Shell # extend, so Shell's methods are callable from class methods like begin
  def self.begin
    # do something with shell
  end
end
App.begin
That said, if you're making a REPL and do want your methods available in the global scope, then a bare include Shell without a wrapper class seems like a good approach.
Related
In the book Practical Object-Oriented Design in Ruby, Sandi Metz says that the main use of modules is to implement duck types and include them in every class that needs them. Why is the Ruby Kernel a module included in Object? As far as I know, it isn't used anywhere else. What's the point of using a module?
Ideally,
Methods in spirit (that are applicable to any object), that is, methods that make use of the receiver, should be defined on the Object class, while
Procedures (provided globally), that is, methods that ignore the receiver, should be collected in the Kernel module.
Kernel#puts, for example, doesn't do anything with its receiver: it doesn't call private methods on it, it doesn't access any of its instance variables, it only acts on its arguments.
Procedures in Ruby are faked by using Ruby's feature that a receiver that is equal to self can be omitted. They are also often made private to prevent them from being called with an explicit receiver and thus being even more confusing. E.g., "Hello".puts would print a newline and nothing else since puts only cares about its arguments, not its receiver. By making it private, it can only be called as puts "Hello".
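To make the point about the private puts concrete, here is a small sketch. The capture_stdout helper is a hypothetical name introduced only for this illustration; everything else is core Ruby plus StringIO from the standard library.

```ruby
require "stringio"

# Hypothetical helper: runs the block with $stdout swapped for a StringIO
# and returns whatever was printed in the meantime.
def capture_stdout
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = original
end

# An explicit receiver is rejected because Kernel#puts is private.
begin
  "Hello".puts
rescue NoMethodError => e
  puts "explicit receiver rejected: #{e.class}"
end

# send bypasses the visibility check; the receiver is then simply ignored.
printed = capture_stdout { "Hello".send(:puts) }
puts printed.inspect # only a newline was printed, "Hello" is nowhere in it
```

Going through send is only there to demonstrate what would happen without the visibility restriction; normal code would just write puts "Hello".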
In reality, due to the long history of Ruby, that separation hasn't always been strictly followed. It is also additionally complicated by the fact that some Kernel methods are documented in Object and vice versa, and even further by the fact that when you define something which looks like a global procedure, and which by the above reasoning should then end up in Kernel, it actually ends up as a private instance method in Object.
As you already pointed out, modules provide a way to collect and structure behavior, and so does the Kernel module. It is mixed into the class Object early on, so every Ruby object provides these methods. Only BasicObject sits above Object in the hierarchy; the main purpose of its child Object is to be extended with the Kernel methods. BasicObject itself has only a handful of very basic methods, such as __send__ or __id__.
class Object < BasicObject
  include Kernel # all those many default methods we appreciate :)
end
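The relationship described above can be checked from any Ruby session. This is just a quick inspection sketch using reflection methods from core Ruby; the exact list of ancestors and BasicObject methods can vary with the Ruby version and loaded libraries, so no output is assumed for those.

```ruby
# Object's direct superclass is BasicObject...
puts Object.superclass.inspect

# ...and Kernel is mixed into Object, but not into BasicObject.
puts Object.include?(Kernel)
puts BasicObject.include?(Kernel)

# The full lookup chain and BasicObject's small public interface.
puts Object.ancestors.inspect
puts BasicObject.instance_methods.sort.inspect
```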
This is a simple question: Should I have a module that contains all my classes (and submodules):
module ProjectName
  class Something
    # code
  end
  module Abc
    # code
  end
end
Or simply everything in a global scope:
class Something
  # code
end
module Abc
  # code
end
It is considered good practice not to pollute your global scope. Namespacing your application into modules and encapsulating related behaviour makes it easier to comprehend, helps avoid naming conflicts, and lets you easily port parts of your code into other applications or contexts.
In Ruby it also gives you a natural way of storing module wide constants, and gives you the option to add methods that don't need a containing object directly to the module.
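As a rough illustration of those two points, here is a sketch that reuses the ProjectName and Something names from the question. The VERSION value and the banner and describe methods are made up for the example.

```ruby
module ProjectName
  VERSION = "0.1.0" # module-wide constant (hypothetical value)

  # A method that needs no containing object lives directly on the module.
  def self.banner
    "ProjectName v#{VERSION}"
  end

  class Something
    def describe
      # Constants and module methods are reachable through the namespace.
      "#{self.class.name} from #{ProjectName.banner}"
    end
  end
end

puts ProjectName::VERSION
puts ProjectName.banner
puts ProjectName::Something.new.describe
```

Only the single constant ProjectName ends up in the global scope; everything else is reached through it.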
In some languages (notably JavaScript), scoping also has an impact on performance, as keeping objects in the global scope might prevent them from becoming eligible for garbage collection.
All over the Internet, I see people using "include" to bring new functionality to the main scope.
Most recently, I saw this in an SO answer:
require 'fileutils' #I know, no underscore is not ruby-like
include FileUtils
# Gives you access (without prepending by 'FileUtils.') to
cd(dir, options)
cd(dir, options) {|dir| .... }
pwd()
Extend works too, but from my understanding, ONLY extend should work. I don't even know why the main object has the include functionality.
Inside the main scope:
self.class == Object #true
Object.new.include #NoMethodError: undefined method `include' for #<Object:0x000000022c66c0>
And I mean, logically, since self.is_a?(Module) == false inside the main scope, main shouldn't even have include, since include is used to add methods to instances of a class, and main isn't a class or a module, so there are no such instances to speak of.
My question is: why does include work in the main scope, and what design decision led to making it work there? And shouldn't we prefer extend in that case, so as not to make the extend/include distinction even more confusing than it already may be, given how often people use Ruby hooks to invoke one when the other is called?
The main scope is "special". Instance methods defined at the main scope become private instance methods of Object, constants become constants of Object, and include includes into Object. There are probably other things I'm missing. (E.g., presumably, prepend prepends to Object; I never thought about it until now.)
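A small script can make this visible. The sketch below assumes it is run as a plain Ruby file (so that self is main); the Greeting module and the helper method are hypothetical names used only for the demonstration.

```ruby
module Greeting
  def hello
    "hello from #{self.class}"
  end
end

# At the main scope, include mixes the module into Object.
include Greeting

# At the main scope, def creates a private instance method of Object.
def helper
  :helper
end

puts Object.include?(Greeting)               # Greeting is now an ancestor of Object
puts 42.hello                                # so every object responds to hello
puts Object.private_method_defined?(:helper) # helper lives on Object, privately
```

This also shows why include at the top level is heavier than it looks: every object in the program gains the module's methods, whereas extend would only affect main itself.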
Overview:
main.rb
items/
one.rb
two.rb
three.rb
Every file in items/ should have a human readable description (serialization is out), like so (but maybe a DSL would be better?):
class One < BaseItem
  name "Item one"
  def meth
    "something"
  end
main.rb should be able to instantiate all objects from the items/ directory. How could this be accomplished? Not familiar with Ruby, I see the object model allows for some pretty cool things (those class hooks, etc), but I'm having trouble finding a way to solve this.
Any input is appreciated.
EDIT:
Shoot, I may have missed the gist of it - what I didn't mention was the stuff in the items/ dir would be dynamic — treat items as plugins, I'd want main.rb to autodetect everything in that dir at runtime (possibly force a reload during execution). main.rb has no prior knowledge of the objects in there, it just knows what methods to expect from them.
I've looked at building DSLs, considering defining (in main.rb) a spawn function that takes a block. A sample file in items/ would look something like:
spawn do
  name "Item name"
  def foo
    "!"
  end
end
And the innards of spawn would create a new object of the base type and pass the block to instance_eval. That meant I'd need to have a method name to set the value, but incidentally, I also wanted the value to be accessible under name, so I had to go around it renaming the attr.
I've also tried the inherit route: make every item file contain a class that inherits from a BaseItem of sorts, and hook into it via inherited ... but that didn't work (the hook never fired, I've lost the code now).
EDIT2:
You could look at what homebrew does with its formulas, that's very close to what I'd want - I just didn't have the ruby prowess to reverse engineer how it handles a formula.
It all boils down to requiring those files and making sure that you implemented the functionality you want in them.
If you want a more specific response, you need to ask a more specific question.
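For what it's worth, one way the "require the files" idea could look is sketched below. It is only an illustration under a few assumptions: registry, item_name and load_items are hypothetical names, a temporary directory stands in for items/, and the class macro is called item_name rather than name because classes already respond to name. Subclasses are collected through the inherited hook as each file is loaded.

```ruby
require "tmpdir"

class BaseItem
  @registry = []

  class << self
    attr_reader :registry

    # Called by Ruby whenever a class inherits from BaseItem.
    def inherited(subclass)
      super
      BaseItem.registry << subclass
    end

    # Acts as a setter when given a value and as a getter otherwise.
    def item_name(value = nil)
      @item_name = value if value
      @item_name
    end
  end
end

# Loads every .rb file in dir, then instantiates all registered item classes.
# load (unlike require) re-evaluates a file each time, which allows reloading.
def load_items(dir)
  Dir.glob(File.join(dir, "*.rb")).sort.each { |path| load path }
  BaseItem.registry.map(&:new)
end

# Demo: write one plugin file into a temporary directory and load it.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "one.rb"), <<~RUBY)
    class One < BaseItem
      item_name "Item one"

      def meth
        "something"
      end
    end
  RUBY

  items = load_items(dir)
  puts items.map { |item| "#{item.class.item_name}: #{item.meth}" }
end
```

Calling load_items again reopens the existing classes instead of creating new ones, so the inherited hook does not fire twice and the registry stays free of duplicates.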
I am no expert on object persistence, but the answer to your specific question is that you have two good choices: one is YAML, and the other is Ruby itself, via a DSL written by you or someone else and specific to your business logic.
But I think a more general answer would require reviewing object persistence in Ruby more systematically. For example, ActiveRecord::Base descendants persist as database tables. There are other ways; for example, I found http://stone.rubyforge.org/ by googling. This is my problem as well: I'm facing the same question in my work.
What you are asking for looks and smells a lot like a normal Ruby script.
class One < BaseItem
  name "Item one"
  def meth
    "something"
  end
We'd close the class definition with another end statement. name "Item one" would probably be done inside the initialize method, by setting an instance variable:
attr_reader :name
def initialize(name)
  @name = name # store the name in an instance variable for attr_reader
end
Typically we wouldn't call the folder "items", but instead it would be "lib", but otherwise what you are talking about is very normal and expected.
Instantiating all items in a folder is easily done by iterating over the folder's contents, requiring the files, and calling the new method for that item. You can figure out the name by mapping the filename to the class name, or by initializing an instance at the end of the file:
one = One.new("item one")
You could keep track of the items loaded in an array or hash, or just hardwire them in. It's up to you, since this is your code.
It sounds like you haven't tried writing any Ruby scripts, otherwise you would have found this out already. Normal Ruby programming books/documentation would have covered this. As is, the question is akin to premature optimization, and working with the language would have given you the answer.
In ruby, I understand that module functions can be made available without mixing in the module by using module_function as shown here. I can see how this is useful so you can use the function without mixing in the module.
module MyModule
  def do_something
    puts "hello world"
  end
  module_function :do_something
end
My question is though why you might want to have the function defined both of these ways.
Why not just have
def MyModule.do_something
OR
def do_something
In what kind of cases would it be useful to have the function available to be mixed in, or to be used as a static method?
Think of Enumerable.
This is the perfect example of when you need to include a module. If your class defines #each, you get a lot of goodness just by including it (#map, #select, etc.). This is the only case where I use modules as mixins: when the module provides functionality in terms of a few methods defined in the class you include the module in. I would argue that this should be the only case in general.
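A minimal sketch of that pattern, with a hypothetical Playlist class: the class supplies only each, and Enumerable builds the rest on top of it.

```ruby
class Playlist
  include Enumerable

  def initialize(*tracks)
    @tracks = tracks
  end

  # The one method Enumerable needs from the including class.
  def each(&block)
    return enum_for(:each) unless block
    @tracks.each(&block)
    self
  end
end

list = Playlist.new("intro", "verse", "chorus")
puts list.map(&:upcase).inspect               # provided by Enumerable
puts list.select { |t| t.length > 5 }.inspect # provided by Enumerable
puts list.include?("verse")                   # provided by Enumerable
```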
As for defining "static" methods, a better approach would be:
module MyModule
  def self.do_something
  end
end
You don't really need to call #module_function. I think it is just weird legacy stuff.
You can even do this:
module MyModule
  extend self
  def do_something
  end
end
...but it won't work well if you also want to include the module somewhere. I suggest avoiding it until you learn the subtleties of Ruby metaprogramming.
Finally, if you just do:
def do_something
end
...it will not end up as a global function, but as a private method on Object (there are no functions in Ruby, just methods). There are a few downsides. First, you don't have namespacing: if you define another function with the same name, the one that gets evaluated later wins. Second, if you have functionality implemented in terms of #method_missing, having a private method in Object will shadow it. And finally, monkey patching Object is just evil business :)
EDIT:
module_function can be used in a way similar to private:
module Something
  def foo
    puts 'foo'
  end
  module_function
  def bar
    puts 'bar'
  end
end
That way, you can call Something.bar, but not Something.foo. If you define any other methods after this call to module_function, they would also be available without mixing in.
I don't like it for two reasons, though. First, modules that are both mixed in and have "static" methods sound a bit dodgy. There might be valid cases, but it won't be that often. As I said, I prefer either to use a module as a namespace or mix it in, but not both.
Second, in this example, bar would also be available to classes/modules that mix in Something. I'm not sure when this is desirable, since either the method uses self and it has to be mixed in, or doesn't and then it does not need to be mixed in.
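To see both effects side by side, here is a variant of the Something example in which the methods return strings instead of printing, plus a hypothetical Client class that mixes the module in. Note that module_function makes the mixed-in copy of bar private, so it is callable inside Client but not from outside.

```ruby
module Something
  def foo
    "foo"
  end

  module_function

  def bar
    "bar"
  end
end

class Client
  include Something

  def call_bar
    bar # works: the mixed-in copy is a private instance method
  end
end

puts Something.bar                 # module-level copy is public
puts Something.respond_to?(:foo)   # foo was defined before module_function
puts Client.new.foo                # plain mixin method, public
puts Client.new.call_bar           # bar reached through an implicit receiver
puts Client.new.respond_to?(:bar)  # but not part of Client's public interface
```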
I think module_function without a method name is used considerably more often than with one. The same goes for private and protected.
It's a good way for a Ruby library to offer functionality that does not use (much) internal state. So if you (e.g.) want to offer a sin function and don't want to pollute the "global" (Object) namespace, you can define it as class method under a constant (Math).
However, an app developer, who wants to write a mathematical application, might need sin every two lines. If the method is also an instance method, she can just include the Math (or My::Awesome::Nested::Library) module and can now directly call sin (stdlib example).
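Both styles can be seen with the real Math module from core Ruby. The Wave class below is a made-up example of an application class that includes Math so it can call sin directly.

```ruby
# Style 1: call the function on the module, nothing is mixed in.
puts Math.sin(0.0)

# Style 2: include the module where the functions are used a lot.
class Wave
  include Math

  def initialize(amplitude)
    @amplitude = amplitude
  end

  def at(t)
    @amplitude * sin(t) # sin comes from the included Math module
  end
end

puts Wave.new(2.0).at(Math::PI / 2)
```

Because the mixed-in copies are private, including Math does not add sin to the public interface of Wave; it only makes it convenient to call internally.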
It's really about making a library more comfortable for its users. They can choose for themselves whether they want the functionality of your library at the top level.
By the way, you can achieve functionality similar to module_function by using extend self (in the first line of the module). To my mind, it looks better and makes things a bit clearer to understand.
Update: More background info in this blog article.
If you want to look at a working example, check out the chronic gem:
https://github.com/mojombo/chronic/blob/master/lib/chronic/handlers.rb
and Handlers is being included in the Parser class here:
https://github.com/mojombo/chronic/blob/master/lib/chronic/parser.rb
He's using module_function to send the methods from Handlers to specific instances of Handler using that instance's invoke method.