Here is a code sample from the ruby pickaxe book:
Class VowelFinder
include Enumerable
def initialize(string)
#string = string
end
def each
#string.scan(/[aeiou]/] do |vowel|
yield vowel
end
end
end
vf = VowelFinder.new("the quick brown fox jumped")
vf.inject(:+) # => 'euiooue'
What I am having a hard time understanding is where the modified 'each' method comes into play? I assume inject is calling it at some point but don't understand why it's doing it, and how I could predict or emulate this behavior in my code.
Thanks!
the VowelFinder class implements the protocol that is mandatory for the Enumerable module that you can include to gain a lot of ruby iterator methods: http://apidock.com/ruby/Enumerable
The Enumerable mixin provides collection classes with several
traversal and searching methods, and with the ability to sort. The
class must provide a method each, which yields successive members of
the collection. If Enumerable#max, #min, or #sort is used, the objects
in the collection must also implement a meaningful <=> operator, as
these methods rely on an ordering between members of the collection.
The call to inject actually causes a call to each. The way inject works is by going over all the elements of the Enumerable using each, applying the passed operator to the memo and to the next value and storing the result back into memo. The result of the entire call is the value of memo in the end. You can see the documentation in here.
You are correct that inject actually calls each method and the reason this happens is because inject actually uses each method internally. In fact, a lot if the Enumerable methods use the .each method; like #map, #select, #reject.
The reason for this is that .each is the actual method which allows looping in an Enumerable Object. The other methods are just so that things get easy for us, developers, and we don't have to use .each everywhere. Imagine having to write the above code using .each. It wouldn't be tough but vf.inject(:+) is definitely easier. So is the case with collect or map.
And the best way to implement this(as is done in Ruby) without repeating code is to have the other method call #each as there is no code duplication and each of #map or #collect or #inject does not have to traverse the Enumerable Object differently.
Genrally, I avoid using #each and there are so many other methods which makes our job easier. If you want to modify your array collection, it's advisable not to money-patch the #each method (unless, of course, you want all the other methods to change as well.)
Related
Most of the Factorybot factories are like:
FactoryBot.define do
factory :product do
association :shop
title { 'Green t-shirt' }
price { 10.10 }
end
end
It seems that inside the ":product" block we are building a data structure, but it's not the typical hashmap, the "keys" are not declared through symbols and commas aren't used.
So my question is: what kind of data structure is this? and how it works?
How declaring "association" inside the block doesn't trigger a:
NameError: undefined local variable or method `association'
when this would happen on many other situations. Is there a subject in compsci related to this?
The block is not a data structure, it's code. association and friends are all method calls, probably being intercepted by method_missing. Here's an example using that same technique to build a regular hash:
class BlockHash < Hash
def method_missing(key, value=nil)
if value.nil?
return self[key]
else
self[key] = value
end
end
def initialize(&block)
self.instance_eval(&block)
end
end
With which you can do this:
h = BlockHash.new do
foo 'bar'
baz :zoo
end
h
#=> {:foo=>"bar", :baz=>:zoo}
h.foo
#=> "bar"
h.baz
#=> :zoo
I have not worked with FactoryBot so I'm going to make some assumptions based on other libraries I've worked with. Milage may vary.
The basics:
FactoryBot is a class (Obviously)
define is a static method in FactoryBot (I'm going to assume I still haven't lost you ;) ).
Define takes a block which is pretty standard stuff in ruby.
But here's where things get interesting.
Typically when a block is executed it has a closure relative to where it was declared. This can be changed in most languages but ruby makes it super easy. instance_eval(block) will do the trick. That means you can have access to methods in the block that weren't available outside the block.
factory on line 2 is just such a method. You didn't declare it, but the block it's running in isn't being executed with a standard scope. Instead your block is being immediately passed to FactoryBot which passes it to a inner class named DSL which instance_evals the block so its own factory method will be run.
line 3-5 don't work that way since you can have an arbitrary name there.
ruby has several ways to handle missing methods but the most straightforward is method_missing. method_missing is an overridable hook that any class can define that tells ruby what to do when somebody calls a method that doesn't exist.
Here it's checking to see if it can parse the name as an attribute name and use the parameters or block to define an attribute or declare an association. It sounds more complicated than it is. Typically in this situation I would use define_method, define_singleton_method, instance_variable_set etc... to dynamically create and control the underlying classes.
I hope that helps. You don't need to know this to use the library the developers made a domain specific language so people wouldn't have to think about this stuff, but stay curious and keep growing.
Ruby doesn't have interfaces, but how tell other programmers what need to include current module in class, like instance variables, methods, constants, etc?
There's no way of formally defining what's required, it's up to you to document it clearly. The reason for this is Ruby is very dynamic by design so static tests won't work, the problems they detect might be rectified by the time the code is actually executed. Likewise, something that might seem correct could be broken later on by some other code.
C++, Java, and even Objective-C and Swift can do compile-time checking to enforce these things. Once a class is defined it cannot be undefined. Once a method is created it can't be removed. This is not the case in a Smalltalk-derived language like Ruby.
Ruby has no ability to test these things up-front since what the program actually does can change radically from the time the code is loaded and parsed, and when it's actually executed.
If you have a particularly complicated footprint you might want to write a method for testing it that can be exercised to verify that everything's working correctly. That can be called by the programmer whenever they think they're ready.
The only way to verify that your Ruby code is running correctly is to run it. No amount of static analysis will ever come close to that.
There are some existing mixins, classes, and methods in the Ruby core library that have the exact same problem, e.g. Enumerable, Comparable, Range, Hash, Array#uniq: they require certain behavior from other objects in order to work. Some examples are:
Enumerable:
The class must provide a method each, which yields successive members of the collection. If Enumerable#max, #min, or #sort is used, the objects in the collection must also implement a meaningful <=> operator […]
Comparable:
The class must define the <=> operator, which compares the receiver against another object, returning -1, 0, or +1 depending on whether the receiver is less than, equal to, or greater than the other object. If the other object is not comparable then the <=> operator should return nil.
Range:
Ranges can be constructed using any objects that can be compared using the <=> operator. Methods that treat the range as a sequence (#each and methods inherited from Enumerable) expect the begin object to implement a succ method to return the next object in sequence. The step and include? methods require the begin object to implement succ or to be numeric.
Hash:
A user-defined class may be used as a hash key if the hash and eql? methods are overridden to provide meaningful behavior.
And in order to define what "meaningful behavior" means, the documentation of Hash further links to the documentation of Object#hash and Object#eql?:
Object#hash:
[…] This function must have the property that a.eql?(b) implies a.hash == b.hash. […]
Object#eql?:
[…] The eql? method returns true if obj and other refer to the same hash key. […]
So, as you can see, your question is a quite common one, and the answer is: documentation.
This is a subjective question, so with that in mind here is my opinion.
Maybe you could create a base class that your objects inherit from.
for example:
class BaseA
def say(msg)
raise NotImplementedError
end
end
class A < BaseA
def say(msg)
puts "saying #{msg}"
end
end
Even though this isn't a real interface you can "pretend" it's one and have all the classes that need BaseA's methods and then override them with the real behavior. Then I suppose developers could just look at the base class to see what methods need to implemented.
If we call caller method, we get something like:
prog.rb:3:in `a'
prog.rb:6:in `b'
prog.rb:9:in `c'
This is helpful for humans, but if I wanted to analyze the stack programmatically, not really, as two methods called :a may be entirely unrelated.
Is there any way/method to extract information about the receiver of the methods (like its class or object id) as well? For example:
prog.rb:3:in `Klass#a'
prog.rb:6:in `Modoole#b'
prog.rb:9:in `OtherKlass#c'
Formatting is only an example; this info might be an Array or anything.
I'm trying to emulate this with TracePoint, but forming a separate stack is a bad solution. Is there any Ruby way I missed in the docs?
There's an alternative to Kernel#caller named Kernel#caller_locations, that returns an array of Thread::Backtrace::Location objects. According to the manual, these should in theory be able to give you this information through the #label method.
Returns the label of this frame.
Usually consists of method, class, module, etc names with decoration.
After trying this out however, I need to question the term usually in the docs, because it seems to only return the method name still. Unless usually means it works for you, there seems to be no way of accomplishing this as of now.
Edit:
As per comment, one case that satisfies the condition of usually is when the method call is coming from within a Class or Module body:
class A
def trace
puts caller_locations.first.label
end
end
class B
A.new.trace
end
#=> <class:B>
module C
A.new.trace
end
#=> <module:C>
I have a class called Note, which includes an instance variable called time_spent. I want to be able to do something like this:
current_user.notes.inject{|total_time_spent,note| total_time_spent + note.time_spent}
Is this possible by mixing in the Enumerable module? I know you are supposed to do add include Enumerable to the class and then define an each method, but should the each method be a class or instance method? What goes in the each method?
I'm using Ruby 1.9.2
It's easy, just include the Enumerable module and define an each instance method, which more often than not will just use some other class's each method. Here's a really simplified example:
class ATeam
include Enumerable
def initialize(*members)
#members = members
end
def each(&block)
#members.each do |member|
block.call(member)
end
# or
# #members.each(&block)
end
end
ateam = ATeam.new("Face", "B.A. Barracus", "Murdoch", "Hannibal")
#use any Enumerable method from here on
p ateam.map(&:downcase)
For further info, I recommend the following article: Ruby Enumerable Magic: The Basics.
In the context of your question, if what you expose through an accessor already is a collection, you probably don't need to bother with including Enumerable.
The Enumerable documentations says the following:
The Enumerable mixin provides collection classes with several
traversal and searching methods, and with the ability to sort. The
class must provide a method each, which yields successive members of
the collection. If Enumerable#max, #min, or #sort is used, the objects
in the collection must also implement a meaningful <=> operator, as
these methods rely on an ordering between members of the collection.
This means implementing each on the collection. If you're interested in using #max, #min or #sort you should implement <=> on its members.
See: Enumerable
we can call the Array method in the top level like this
Array(something)
that makes sense to me, it's a method call without explicit receiver, and self, which is main in this case, is inserted at the front of the method call. But isn't it that this is equivalent to :
Kernel.Array(something)
this doesn't make sense to me. Since in the first case, the object main is of class Object, which got Kernel module mixed in, thus have the Array method. But in the second case, we are calling the Array method on the Kernel module object itself, rather than main object, didn't they are NOT the same thing?
sorry for my bad english.
Kernel.Array is what is known as a module function. Other examples of module functions include Math.sin, and Math.hypot and so on.
A module function is a method that is both a class method on the module and also a private instance method. When you invoke Array() at the top-level you are invoking it as a private instance method of the main object. When you invoke it through Kernel.Array() you are invoking it as a class method on Kernel. They are the same method.
To learn more, read up on the module_function method in rubydocs: http://www.ruby-doc.org/core/classes/Module.html#M001642
class Object mixed-in module Kernel, but Kernel is an instance of Object. So Kernel "module" methods - is it's instance methods.
What's confusing you is the difference between class and instance methods.
Class methods don't have an explicit receiver, and thus no self to access other fields with. They just... are.
Generally instance methods are used to query or manipulate the attributes of a given object, whereas the class methods are "helper" or "factory" methods that provide some functionality associated with or especially useful for a certain kind of class, but not dependent on actual live instances (objects) of that class.
Not sure about Ruby, but Java has (for example) a whole class, Math that contains nothing but instance methods like sin(), max(), exp() and so forth: There is no "Math" object, these are just methods that embody mathematical algorithms. Not the best example, because in Ruby those methods are probably embedded right into the numeric classes as instance methods.
The case you mention is a bit confusing because Array's () method and Kernel's Array() method are in fact different methods that do similar things. Both are class methods.
Array() takes a list of arguments and makes and returns an array containing them.
Kernel.Array() takes a single argument of an "array-able" type, such as a sequence, and takes the values returned by this argument and builds an array from those.
UPDATE
The downvote was perhaps justified; I apologize for taking on a subject outside my area of expertise. I think I'll be deleting this answer soon.
# Chuck: I would sincerely hope that a language/library's official documentation would offer some meaningful clues as to how it works. This is what I consulted in answering this question.
The rdoc for Kernel.Array():
Returns arg as an Array. First tries to call arg.to_ary, then arg.to_a. If both fail, creates a single element array containing arg (unless arg is nil).
for Array.():
Returns a new array populated with the given objects.
I don't know about you, but I think if the docs vary that much then either they're talking about separate methods or the documentation is a train wreck.
# freeknight:
But everything in ruby is an object of some kind, even classes and modules. And Kernel.Array is actually a method call on an specific object - the Kernel object.
Yeah, under the covers it's similar in Java too. But the Array() method isn't doing anything with Kernel, any more than Array() is doing anything with the Array class object, so this is really only a semantic quibble. It's an instance method because you could hang it off class IPSocket if you were crazy enough, and it would still work the same way.
They are the same thing:
a = Kernel.Array('aa')
=> ["aa"]
a.class
=> Array
a = Array('aaa')
=> ["aaa"]
a.class
=> Array
Maybe there is an alias?