Ruby: Module, Mixins and Blocks confusing? - ruby

Following is the code I tried to run from the Ruby Programming Book
http://www.ruby-doc.org/docs/ProgrammingRuby/html/tut_modules.html
Why doesn't the product method give the right output?
I ran it with irb test.rb. And I am running Ruby 1.9.3p194.
module Inject
def inject(n)
each do |value|
n = yield(n, value)
end
n
end
def sum(initial = 0)
inject(initial) { |n, value| n + value }
end
def product(initial = 1)
inject(initial) { |n, value| n * value }
end
end
class Array
include Inject
end
[1, 2, 3, 4, 5].sum ## 15
[1, 2, 3, 4, 5].product ## [[1], [2], [3], [4], [5]]

Since that code example was written, Array has gained a #product method and you're seeing the output of that particular method. Rename your module's method to something like product_new.
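A minimal sketch of the rename, assuming a fresh module so it does not interfere with the question's code. The names InjectHelpers and product_of are hypothetical; any name that Array does not already define would do.

```ruby
# Hypothetical module and method names, chosen so they cannot clash with Array.
module InjectHelpers
  def product_of(initial = 1)
    # inject is the built-in Enumerable#inject here
    inject(initial) { |acc, value| acc * value }
  end
end

class Array
  include InjectHelpers
end

renamed  = [1, 2, 3, 4, 5].product_of   # the mixin method is found
built_in = [1, 2].product([3, 4])       # Array#product: Cartesian product

p renamed   # 120
p built_in  # [[1, 3], [1, 4], [2, 3], [2, 4]]
```

Because Array has no method called product_of, the lookup falls through to the included module.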

Add this line at the end of your code :
p Array.ancestors
and you get (in Ruby 1.9.3) :
[Array, Inject, Enumerable, Object, Kernel, BasicObject]
Array is a subclass of Object and has a superclass pointer to Object. As Enumerable is mixed in (included) by Array, the superclass pointer of Array points to Enumerable, and from there to Object. When you include Inject, the superclass pointer of Array points to Inject, and from there to Enumerable. When you write
[1, 2, 3, 4, 5].product
the method search mechanism starts at the instance object [1, 2, 3, 4, 5], goes to its class Array, and finds product (new in 1.9) there. If you run the same code in Ruby 1.8, the method search mechanism starts at the instance object [1, 2, 3, 4, 5], goes to its class Array, does not find product, goes up the superclass chain, and finds product in Inject, and you get the result 120 as expected.
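The lookup order described above can be checked directly. This is a small sketch assuming Ruby 1.9 or later; LookupDemo is a hypothetical module name standing in for Inject.

```ruby
# Hypothetical module standing in for Inject.
module LookupDemo
  def product(initial = 1)
    inject(initial) { |acc, value| acc * value }
  end
end

class Array
  include LookupDemo
end

# Which class or module supplies the method that actually gets called?
owner = Array.instance_method(:product).owner

# include places the module after the class in the ancestor chain.
array_pos  = Array.ancestors.index(Array)
module_pos = Array.ancestors.index(LookupDemo)

p owner                   # Array
p array_pos < module_pos  # true
p [1, 2, 3].product       # [[1], [2], [3]]
```

instance_method(...).owner reports where the search stopped, which is a quick way to find out whether a mixin method is being shadowed.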
You find a good explanation of Modules and Mixins with graphic pictures in the Pickaxe http://pragprog.com/book/ruby3/programming-ruby-1-9
I remembered seeing requests for a prepend method that inserts a module between the instance and its class, so that the search mechanism finds the module's methods before those of the class. I searched SO for "[ruby] prepend module instead of include" and found, among others, this:
Why does including this module not override a dynamically-generated method?

By the way: in Ruby 2.0, there are two features which help you with both your problems.
Module#prepend prepends a mixin to the inheritance chain, so that methods defined in the mixin override methods defined in the module/class it is being mixed into.
Refinements allow lexically scoped monkeypatching.
Here they are in action (you can get a current build of YARV 2.0 via RVM or ruby-build easily):
module Sum
def sum(initial=0)
inject(initial, :+)
end
end
module ArrayWithSum
refine Array do
prepend Sum
end
end
class Foo
using ArrayWithSum
p [1, 2, 3].sum
# 6
end
p [1, 2, 3].sum
# NoMethodError: undefined method `sum' for [1, 2, 3]:Array
using ArrayWithSum
p [1, 2, 3].sum
# 6
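For comparison, here is a sketch of Module#prepend on its own, as it works in released Ruby 2.0 and later. NumberList and PrependedProduct are hypothetical names; a subclass is used so that Array itself stays untouched.

```ruby
# Hypothetical module: its product should win over Array#product.
module PrependedProduct
  def product(initial = 1)
    inject(initial) { |acc, value| acc * value }
  end
end

# Hypothetical subclass, so the core Array class is not modified.
class NumberList < Array
  prepend PrependedProduct
end

chain  = NumberList.ancestors.first(2)
result = NumberList.new([1, 2, 3, 4, 5]).product

p chain   # [PrependedProduct, NumberList]
p result  # 120
```

With prepend the module sits in front of the class in the ancestor chain, so its product is found before the inherited Array#product.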

In response to @zeronone: "How can we avoid such namespace clashes?"
The first rule is to avoid monkeypatching core classes wherever possible. A better way to do this (IMO) would be to subclass Array:
class MyArray < Array
include Inject
# or you could just dispense with the module and define this directly.
end
xs = MyArray.new([1, 2, 3, 4, 5])
# => [1, 2, 3, 4, 5]
xs.sum
# => 15
xs.product
# => 120
[1, 2, 3, 4, 5].product
# => [[1], [2], [3], [4], [5]]
Ruby may be an OO language, but because it is so dynamic sometimes (I find) subclassing gets forgotten as a useful way to do things, and hence there is an over reliance on the basic data structures of Array, Hash and String, which then leads to far too much re-opening of these classes.

The following code is not very elaborate. It just shows that you already have means today, such as the hooks Ruby calls when certain events occur, to check which method (from the including class or from the included module) will or will not be used.
module Inject
def self.append_features(p_host) # don't use included, it's too late
puts "#{self} included into #{p_host}"
methods_of_this_module = self.instance_methods(false).sort
print "methods of #{self} : "; p methods_of_this_module
first_letter = []
methods_of_this_module.each do |m|
first_letter << m[0, 2]
end
print 'selection to reduce the display : '; p first_letter
methods_of_host_class = p_host.instance_methods(true).sort
subset = methods_of_host_class.select { |m| m if first_letter.include?(m[0, 2]) }
print "methods of #{p_host} we are interested in: "; p subset
methods_of_this_module.each do |m|
puts "#{self.name}##{m} will not be used" if methods_of_host_class.include? m
end
super # <-- don't forget it !
end
Rest as in your post. Execution :
$ ruby -v
ruby 1.8.6 (2010-09-02 patchlevel 420) [i686-darwin12.2.0]
$ ruby -w tinject.rb
Inject included into Array
methods of Inject : ["inject", "product", "sum"]
selection to reduce the display : ["in", "pr", "su"]
methods of Array we are interested in: ["include?", "index",
..., "inject", "insert", ..., "instance_variables", "private_methods", "protected_methods"]
Inject#inject will not be used
$ rvm use 1.9.2
...
$ ruby -v
ruby 1.9.2p320 (2012-04-20 revision 35421) [x86_64-darwin12.2.0]
$ ruby -w tinject.rb
Inject included into Array
methods of Inject : [:inject, :product, :sum]
selection to reduce the display : ["in", "pr", "su"]
methods of Array we are interested in: [:include?, :index, ..., :inject, :insert,
..., :private_methods, :product, :protected_methods]
Inject#inject will not be used
Inject#product will not be used
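A more compact variant of the same check, as a sketch only. ClashCheck, running_total and shadowed_by are hypothetical names. It compares against the methods defined directly on the host class, since those are the ones that take precedence over an included module.

```ruby
# Hypothetical module with one clashing and one non-clashing method.
module ClashCheck
  def product(initial = 1)
    inject(initial) { |acc, value| acc * value }
  end

  def running_total(initial = 0)
    inject(initial) { |acc, value| acc + value }
  end

  # Names defined here that the host class also defines itself.
  def self.shadowed_by(host)
    host_methods = host.instance_methods(false)
    instance_methods(false).select { |name| host_methods.include?(name) }
  end
end

shadowed = ClashCheck.shadowed_by(Array)
p shadowed  # [:product]
```

Unlike the hook above, this does not print anything at include time; it only answers the question on demand.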

Related

Reassign entire array to the same reference

I've searched extensively but sadly couldn't find a solution to this surely often-asked question.
In Perl I can reassign an entire array within a function and have my changes reflected outside the function:
#!/usr/bin/perl -w
use v5.20;
use Data::Dumper;
sub foo {
my ($ref) = @_;
@$ref = (3, 4, 5);
}
my $ref = [1, 2];
foo($ref);
say Dumper $ref; # prints [3, 4, 5]
Now I'm trying to learn Ruby and have written a function where I'd like to change an array items in-place by filtering out elements matching a condition and returning the removed items:
def filterItems(items)
removed, items = items.partition { ... }
After running the function, items returns to its state before calling the function. How should I approach this please?
I'd like to change an array items in-place by filtering out elements matching a condition and returning the removed items [...] How should I approach this please?
You could replace the array content within your method:
def filter_items(items)
removed, kept = items.partition { |i| i.odd? }
items.replace(kept)
removed
end
ary = [1, 2, 3, 4, 5]
filter_items(ary)
#=> [1, 3, 5]
ary
#=> [2, 4]
I would search for pass by value/reference in Ruby. Here is the first one I found: https://mixandgo.com/learn/is-ruby-pass-by-reference-or-pass-by-value.
You pass the value of the reference held by items to the function, not a reference to the variable items itself. The variable items defined outside the method keeps referring to the same object unless you reassign it in its own scope.
Also, filterItems is not Ruby style (method names use snake_case, e.g. filter_items); see https://rubystyle.guide/
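The difference between rebinding the parameter and mutating the object it refers to can be sketched like this. The method names reassign and mutate are hypothetical.

```ruby
# Rebinding: only the method's local variable points at the new array.
def reassign(items)
  items = [3, 4, 5]
  items
end

# Mutating: the caller's array object itself is changed.
def mutate(items)
  items.replace([3, 4, 5])
end

a = [1, 2]
reassign(a)
b = [1, 2]
mutate(b)

p a  # [1, 2]
p b  # [3, 4, 5]
```

Array#replace is the closest Ruby counterpart to assigning through the reference in the Perl example from the question.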
TL;DR
To access or modify an outer variable within a block, declare the variable outside the block. To access a variable outside of a method, store it in an instance or class variable. There's a lot more to it than that, but this covers the use case in your original post.
Explanation and Examples
In Ruby, you have scope gates and closures. In particular, methods and blocks represent scope gates, but there are certainly ways (both routine and meta) for accessing variables outside of your local scope.
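A small sketch of the scope-gate behaviour, assuming it runs as a top-level script. The names total and sees_total? are hypothetical.

```ruby
total = 0

# A block is a closure: it can read and update the outer local.
[1, 2, 3].each { |n| total += n }

# A method body is a scope gate: outer locals are not visible inside it.
def sees_total?
  local_variables.include?(:total)
end

p total        # 6
p sees_total?  # false
```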
In a class, this is usually handled by instance variables. So, as a simple example of String#partition (because it's easier to explain than Enumerable#partition on an Array):
def filter items, separator
head, sep, tail = items.partition separator
@items = tail
end
filter "foobarbaz", "bar"
#=> "baz"
@items
#=> "baz"
Inside a class or within irb, this will modify whatever's passed and then assign it to the instance variable outside the method.
Partitioning Arrays Instead of Strings
If you really don't want to pass things as arguments, or if @items should be an Array, then you can certainly do that too. However, Arrays behave differently, so I'm not sure what you really expect Array#partition (which is inherited from Enumerable) to yield. This works, using Enumerable#slice_after:
class Filter
def initialize
@items = []
end
def filter_array items, separator
# slice after the separator and keep only the last slice
@items = items.slice_after { |i| i == separator }.to_a.pop
end
end
f = Filter.new
f.filter_array [3, 4, 5], 4
#=> [5]
Look into the Array class for any method which mutates the object, for example all the method with a bang or methods that insert elements.
Here is an Array#push:
ary = [1,2,3,4,5]
def foo(ary)
ary.push *[6, 7]
end
foo(ary)
ary
#=> [1, 2, 3, 4, 5, 6, 7]
Here is an Array#insert:
ary = [1,2,3,4,5]
def baz(ary)
ary.insert(2, 10, 20)
end
baz(ary)
ary
#=> [1, 2, 10, 20, 3, 4, 5]
Here is an example with a bang Array#reject!:
ary = [1,2,3,4,5]
def zoo(ary)
ary.reject!(&:even?)
end
zoo(ary)
ary
#=> [1, 3, 5]
Another with a bang Array#map!:
ary = [1,2,3,4,5]
def bar(ary)
ary.map! { |e| e**2 }
end
bar(ary)
ary
#=> [1, 4, 9, 16, 25]

Is there a default block argument in Ruby?

I am just starting to do Groovy after mostly doing ruby.
Groovy has a default 'block argument' called it (that may not be the official Groovy terminology; I'm new to Groovy).
(1..10).each {println(it)}
What about Ruby? Is there a default I can use so I don't have to make |my_block_arg| every time?
Thanks!
No, you don't have a "default" in Ruby.
Though, you can do
(1..10).each(&method(:puts))
As Andrey Deineko's answer explained, there is no default. You can, however, set the self context using BasicObject#instance_eval or BasicObject#instance_exec. I don't recommend doing this, since it can produce unexpected results. If you know what you're doing, though, the following is still an option:
class Enumerator
def with_ie(&block)
return to_enum(__method__) { each.size } unless block_given?
each { |e| e.instance_eval(&block) }
end
end
(1..10).each.with_ie { puts self }
# 1
# 2
# 3
# 4
# 5
# 6
# 7
# 8
# 9
# 10
#=> 1..10
(1..10).map.with_ie { self * self }
#=> [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
(-5..5).select.with_ie { positive? }
#=> [1, 2, 3, 4, 5]
If you want to call one method you might as well do (-5..5).select(&:positive?), but when the objects you're iterating over have actual attributes it might be worth the trouble. For example:
people.map.with_ie { "#{id}: #{first_name} - #{last_name}" }
Keep in mind that if you have a local variable id, first_name or last_name in scope, those are used instead of the methods on the object. This also doesn't quite work for hashes or Enumerable methods that pass more than one block argument. In this case self is set to an array containing the arguments. For example:
{a: 1, b: 2}.map.with_ie { self }
#=> [[:a, 1], [:b, 2]]
{a: 1, b: 2}.map.with_ie { self[0] }
#=> [:a, :b]
From Ruby 2.7 onwards, you can use numbered block arguments:
(1..10).each { puts _1 }
Granted, this hasn't been very well documented; some references still use @1, but the above was tested on the official 2.7 release.
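A short sketch of numbered parameters with one and with two block arguments, assuming Ruby 2.7 or later. The variable names are hypothetical.

```ruby
# _1 is the first block argument.
doubled = (1..5).map { _1 * 2 }

# Using _2 makes the block take two arguments, so a Hash pair is split
# into key (_1) and value (_2).
pairs = { a: 1, b: 2 }.map { "#{_1}=#{_2}" }

p doubled  # [2, 4, 6, 8, 10]
p pairs    # ["a=1", "b=2"]
```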

How to get reference to object you're calling in method?

I am trying to create a method for objects that I create. In this case it's an extension of the Array class. The method below, my_uniq, works. When I call puts [1, 1, 2, 6, 8, 8, 9].my_uniq.to_s it outputs [1, 2, 6, 8, 9].
In a similar manner, in median, I'm trying to get the reference of the object itself and then manipulate that data. So far I can only think of using the map function to assign a variable arr as an array to manipulate that data from.
Is there any method that you can call that gets the reference to what you're trying to manipulate? Example pseudo-code that I could replace arr = map {|n| n} with something like: arr = self.
class Array
def my_uniq
hash = {}
each do |num|
hash[num] = 0;
end
hash.keys
end
end
class Array
def median
arr = map {|n| n}
puts arr.to_s
end
end
Thanks in advance!
dup
class Array
def new_self
dup
end
def plus_one
arr = dup
arr.map! { |i| i + 1 }
end
def plus_one!
arr = self
arr.map! { |i| i + 1 }
end
end
array = [1, 3, 5]
array.new_self # => [1, 3, 5]
array.plus_one # => [2, 4, 6]
array # => [1, 3, 5]
array.plus_one! # => [2, 4, 6]
array # => [2, 4, 6]
dup makes a copy of the object, making it a safer choice if you need to manipulate data without mutating the original object. You could use self i.e. arr = self, but anything you do that changes arr will also change the value of self. It's a good idea to just use dup.
If you do want to manipulate and change the original object, then you can use self instead of dup, but you should make it a "bang" ! method. It is a convention in ruby to put a bang ! at the end of a method name if it mutates the receiving object. This is particularly important if other developers might use your code. Most Ruby developers would be very surprised if a non-bang method mutated the receiving object.
class Array
def median
arr = self
puts arr.to_s
end
end
[1,2,3].median # prints [1, 2, 3] (the call itself returns nil, because puts does)
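Putting the advice together, an actual median could be sketched as follows. The name median_value is hypothetical, and returning nil for an empty array is an assumption, not something the question specifies. Array#sort returns a new array, so neither self nor an explicit dup needs to be touched.

```ruby
class Array
  # Hypothetical name; does not mutate the receiver.
  def median_value
    return nil if empty?          # assumption: no median for an empty array
    sorted = sort                 # new array; self keeps its order
    mid = size / 2
    size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
  end
end

numbers = [3, 1, 2]
p numbers.median_value        # 2
p [4, 1, 3, 2].median_value   # 2.5
p numbers                     # [3, 1, 2]
```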

Access `self` of an object through the parameters

Let's say I want to access an element of an array at a random index this way:
[1, 2, 3, 4].at(rand(4))
Is there a way to pass the size of the array like the following?
[1, 2, 3, 4].at(rand(le_object.self.size))
Why would I do that?--A great man once said:
Science isn't about why, it is about why not.
Not recommended, but instance_eval would somehow work:
[1, 2, 3, 4].instance_eval { at(rand(size)) }
And you can also break out of tap:
[1, 2, 3, 4].tap { |a| break a.at(rand(a.size)) }
There's an open feature request to add a method that yields self and returns the block's result. If that makes it into Ruby, you could write:
[1, 2, 3, 4].insert_method_name_here { |a| a.at(rand(a.size)) }
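A method with this behaviour has since been added to Ruby: Kernel#yield_self in 2.5, with the alias then in 2.6. A minimal sketch, assuming Ruby 2.6 or later; the variable name picked is hypothetical.

```ruby
# then yields the receiver to the block and returns the block's result.
picked = [1, 2, 3, 4].then { |a| a.at(rand(a.size)) }

p [1, 2, 3, 4].include?(picked)  # true
```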
No, you can't do that. Receiver of a method (that array) is not accessible by some special name at the call site. Your best bet is assigning a name to that object.
ary = [1, 2, 3, 4]
ary.at(rand(ary.size))
Of course, if all you need is a random element, use .sample, which does not require evaluating any arguments at the call site; its self is the array.
You can use instance_eval to execute ruby code with the binding of the array variable
[1, 2, 3, 4].instance_eval { at(rand(size)) }
Assuming you are interested in a random element as Array#at returns an element at given index, you can use Array#sample to pick a random element from an array.
[1,2,3,4].sample
#=> 3
If you do not want to use instance_eval (or any form of eval), then, you can add a method to Array class by monkey patching - generally speaking, I am not sure whether it's a wise idea to monkey patch though
class Array
def random_index
rand(size)
end
end
["a","b","c","d"].random_index
#=> 2
You could do something similar with lambda:
getrand = ->(x) { x[rand(x.count)] }
getrand.call [1,2,3]
# => 2

Why does Ruby's symbol to proc invoke the count method instead of the :count Hash key?

Given this irb session:
[2.0.0p195]> arr = [{count: 5}, {count: 6}, {count: 7}]
=> [{:count=>5}, {:count=>6}, {:count=>7}]
[2.0.0p195]> arr.collect(&:count)
=> [1, 1, 1]
wat
[2.0.0p195]> arr.collect(&:count).reduce(:+)
=> 3
[2.0.0p195]> arr.collect {|e| e[:count]}.reduce(:+)
=> 18
Can I exclude methods on Hash when collecting or is using a block the only way around this problem?
& means call #to_proc on its argument, and the Symbol class implements this by creating a Proc that calls the method name based on the symbol - so &:symbol means "Call the #symbol method on the passed in object". Essentially, what you've got is the equivalent of this:
arr.collect{|obj| obj.send(:count)}
Hash#count is not the same as Hash#[](:count): a Hash never answers count with the value stored under the :count key (though OpenStruct does do this for you). So you're stuck with the block form.
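The two calls side by side, as a small sketch reusing the array from the question. The variable names via_symbol, via_block and total are hypothetical.

```ruby
arr = [{ count: 5 }, { count: 6 }, { count: 7 }]

# &:count calls Hash#count, the number of key/value pairs in each hash.
via_symbol = arr.collect(&:count)

# The block form looks up the :count key instead.
via_block = arr.collect { |h| h[:count] }
total     = via_block.reduce(:+)

p via_symbol  # [1, 1, 1]
p via_block   # [5, 6, 7]
p total       # 18
```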
Another alternative is to create a lambda, useful if you are writing the same block many times:
fetch_count = -> x{x[:count]}
arr.collect(&fetch_count) #=> [5, 6, 7]
# If hash only has one value as in example:
arr.collect(&:values).flatten #=> [5, 6, 7]
The implementation of calling & on a symbol is as follows (more or less):
class Symbol
def to_proc
Proc.new { |obj| obj.send self }
end
end
You can see that all it is doing (when combined with a #map) is calling the method corresponding to the provided symbol on each member of the enumerable.
You could fix this if you really wanted by using OpenStructs instead of hashes, they have method-style access of elements:
require 'ostruct' # OpenStruct lives in the standard library
[{test: 1}].map { |h| OpenStruct.new(h) }.map &:test
#=> [1]
Or invent an operator that does what you want for hash access in addition to &, I may revisit this challenge if I have a spare moment later!
EDIT: I have returned
This is hacky but you could monkey-patch symbol to provide the functionality that you wish for by augmenting with unary ~:
# Patch
class Symbol
def ~@
->(obj){ obj[self] }
end
end
# Example usage:
[{count: 5}, {count: 6}, {count: 7}].map &~:count
#=> [5, 6, 7]
If a free-for-all language such as Ruby doesn't have a feature that you wish for, you can always build it in :-)
Disclaimer: This is probably a terrible idea.
