How is Array#each implemented in Ruby? - ruby

I'm curious about the mechanism of collection and block in Ruby. We can define a class like this:
class Foo
include Enumerable
def initialize
#data = []
end
def each
if block_given?
#data.each { |e| yield(e) }
else
Enumerator.new(self, :each)
end
end
end
I want to know how #data.each can use the block { |e| yield(e) } as the parameter. I searched the implementation of Array#each:
rb_ary_each(VALUE array)
{
long i;
volatile VALUE ary = array;
RETURN_SIZED_ENUMERATOR(ary, 0, 0, ary_enum_length);
for (i=0; i<RARRAY_LEN(ary); i++) {
rb_yield(RARRAY_AREF(ary, i));
}
return ary;
}
but this doesn't even seem like a Ruby language implementation. How is Array#each implemented in Ruby?
Edit:
It looks like C code traverses the first to final array element and call ruby yield function with the parameter of the traversed array element. But where is the block passed to rb_ary_each function?

Here's a simple Ruby implementation of an each-like method that doesn't rely on the existence of any of Ruby's built-in iterator methods:
class List
def initialize(arr)
#arr = arr
end
def each
i = 0
while i < #arr.length
yield #arr[i]
i += 1
end
end
end
Usage:
l = List.new(['a', 'b', 'c'])
l.each {|x| puts x}
Output:
a
b
c

I may come late to this topic, but I guess that VALUE array is a object/struct instead of an primitive value as int, char, etc. So, what rb_ary_each method is receiving is a pointer/reference to the object/struct of the type VALUE. If that is true (I'm just guessing here for what I know about C) then the block is passed inside the object/struct with the remain arguments.
I'm also guessing that another object/struct should exist to hold the block/proc/lambda, so the VALUE object/struct should hold a pointer/reference to the first one.
If I'm wrong, please, someone correct me! Your question are much more related to how C works than Ruby.

Related

Using range.each vs while-loop to work with sequence of numbers in Ruby

Total beginner here, so I apologize if a) this question isn't appropriate or b) I haven't asked it properly.
I'm working on simple practice problems in Ruby and I noticed that while I arrived at a solution that works, when my solution runs in a visualizer, it gives premature returns for the array. Is this problematic? I'm also wondering if there's any reason (stylistically, conceptually, etc.) why you would want to use a while-loop vs. a for-loop with range for a problem like this or fizzbuzz.
Thank you for any help/advice!
The practice problem is:
# Write a method which collects all numbers between small_num and big_num into
an array. Ex: range(2, 5) => [2, 3, 4, 5]
My solution:
def range(small_num, big_num)
arr = []
(small_num..big_num).each do |num|
arr.push(num)
end
return arr
end
The provided solution:
def range(small_num, big_num)
collection = []
i = small_num
while i <= big_num
collection << i
i += 1
end
collection
end
Here's a simplified version of your code:
def range(small_num, big_num)
arr = [ ]
(small_num..big_num).each do |num|
arr << num
end
arr
end
Where the << or push function does technically have a return value, and that return value is the modified array. This is just how Ruby works. Every method must return something even if that something is "nothing" in the form of nil. As with everything in Ruby even nil is an object.
You're not obligated to use the return values, though if you did want to you could. Here's a version with inject:
def range(small_num, big_num)
(small_num..big_num).inject([ ]) do |arr, num|
arr << num
end
end
Where the inject method takes the return value of each block and feeds it in as the "seed" for the next round. As << returns the array this makes it very convenient to chain.
The most minimal version is, of course:
def range(small_num, big_num)
(small_num..big_num).to_a
end
Or as Sagar points out, using the splat operator:
def range(small_num, big_num)
[*small_num..big_num]
end
Where when you splat something you're in effect flattening those values into the array instead of storing them in a sub-array.

Ruby Enumerable#find returning mapped value

Does Ruby's Enumerable offer a better way to do the following?
output = things
.find { |thing| thing.expensive_transform.meets_condition? }
.expensive_transform
Enumerable#find is great for finding an element in an enumerable, but returns the original element, not the return value of the block, so any work done is lost.
Of course there are ugly ways of accomplishing this...
Side effects
def constly_find(things)
output = nil
things.each do |thing|
expensive_thing = thing.expensive_transform
if expensive_thing.meets_condition?
output = expensive_thing
break
end
end
output
end
Returning from a block
This is the alternative I'm trying to refactor
def costly_find(things)
things.each do |thing|
expensive_thing = thing.expensive_transform
return expensive_thing if expensive_thing.meets_condition?
end
nil
end
each.lazy.map.find
def costly_find(things)
things
.each
.lazy
.map(&:expensive_transform)
.find(&:meets_condition?)
end
Is there something better?
Of course there are ugly ways of accomplishing this...
If you had a cheap operation, you'd just use:
collection.map(&:operation).find(&:condition?)
To make Ruby call operation only "on a as-needed basis" (as the documentation says), you can simply prepend lazy:
collection.lazy.map(&:operation).find(&:condition?)
I don't think this is ugly at all—quite the contrary— it looks elegant to me.
Applied to your code:
def costly_find(things)
things.lazy.map(&:expensive_transform).find(&:meets_condition?)
end
I would be inclined to create an enumerator that generates values thing.expensive_transform and then make that the receiver for find with meets_condition? in find's block. For one, I like the way that reads.
Code
def costly_find(things)
Enumerator.new { |y| things.each { |thing| y << thing.expensive_transform } }.
find(&:meets_condition?)
end
Example
class Thing
attr_reader :value
def initialize(value)
#value = value
end
def expensive_transform
self.class.new(value*2)
end
def meets_condition?
value == 12
end
end
things = [1,3,6,4].map { |n| Thing.new(n) }
#=> [#<Thing:0x00000001e90b78 #value=1>, #<Thing:0x00000001e90b28 #value=3>,
# #<Thing:0x00000001e90ad8 #value=6>, #<Thing:0x00000001e90ab0 #value=4>]
costly_find(things)
#=> #<Thing:0x00000001e8a3b8 #value=12>
In the example I have assumed that expensive_things and things are instances of the same class, but if that is not the case the code would need to be modified in the obvious way.
I don't think there is a "obvious best general solution" for your problem, which is also simple to use. You have two procedures involved (expensive_transform and meets_condition?), and you also would need - if this were a library method to use - as a third parameter the value to return, if no transformed element meets the condition. You return nil in this case, but in a general solution, expensive_transform might also yield nil, and only the caller knows what unique value would indicate that the condition as not been met.
Hence, a possible solution within Enumerable would have the signature
class Enumerable
def find_transformed(default_return_value, transform_proc, condition_proc)
...
end
end
or something similar, so this is not particularily elegant either.
You could do it with a single block, if you agree to merge the semantics of the two procedures into one: You have only one procedure, which calculates the transformed value and tests it. If the test succeeds, it returns the transformed value, and if it fails, it returns the default value:
class Enumerable
def find_by(default_value, &block)
result = default_value
each do |element|
result = block.call(element)
break if result != default_value
end
end
result
end
You would use it in your case like this:
my_collection.find_by(nil) do |el|
transformed_value = expensive_transform(el)
meets_condition?(transformed_value) ? transformed_value : nil
end
I'm not sure whether this is really intuitive to use...

How to use reduce/inject in Ruby without getting Undefined variable

When using an accumulator, does the accumulator exist only within the reduce block or does it exist within the function?
I have a method that looks like:
def my_useless_function(str)
crazy_letters = ['a','s','d','f','g','h']
str.split.reduce([]) do |new_array, letter|
for a in 0..crazy_letters.length-1
if letter == crazy_letters[a]
new_array << letter
end
end
end
return true if (new_array == new_array.sort)
end
When I execute this code I get the error
"undefined variable new_array in line 11 (the return statement)"
I also tried assigning the new_array value to another variable as an else statement inside my reduce block but that gave me the same results.
Can someone explain to me why this is happening?
The problem is that new_array is created during the call to reduce, and then the reference is lost afterwards. Local variables in Ruby are scoped to the block they are in. The array can be returned from reduce in your case, so you could use it there. However, you need to fix a couple things:
str.split does not break a string into characters in Ruby 2+. You should use str.chars, or str.split('').
The object retained for each new iteration of reduce must be retained by returning it from the block each time. The simplest way to do this is to put new_array as the last expression in your block.
Thus:
def my_useless_function(str)
crazy_letters = ['a','s','d','f','g','h']
crazy_only = str.split('').reduce([]) do |new_array, letter|
for a in 0..crazy_letters.length-1
if letter == crazy_letters[a]
new_array << letter
end
end
new_array
end
return true if (crazy_only == crazy_only.sort)
end
Note that your function is not very efficient, and not very idiomatic. Here's a shorter version of the function that is more idiomatic, but not much more efficient:
def my_useless_function(str)
crazy_letters = %w[a s d f g h]
crazy_only = str.chars.select{ |c| crazy_letters.include?(c) }
crazy_only == crazy_only.sort # evaluates to true or false
end
And here's a version that's more efficient:
def efficient_useless(str)
crazy_only = str.scan(/[asdfgh]/) # use regex to search for the letters you want
crazy_only == crazy_only.sort
end
Block local variables
new_array doesn't exist outside the block of your reduce call. It's a "block local variable".
reduce does return an object, though, and you should use it inside your method.
sum = [1, 2, 3].reduce(0){ |acc, elem| acc + elem }
puts sum
# 6
puts acc
# undefined local variable or method `acc' for main:Object (NameError)
Your code
Here's the least amount of change for your method :
def my_useless_function(str)
crazy_letters = ['a','s','d','f','g','h']
new_array = str.split(//).reduce([]) do |new_array, letter|
for a in 0..crazy_letters.length-1
if letter == crazy_letters[a]
new_array << letter
end
end
new_array
end
return true if (new_array == new_array.sort)
end
Notes:
return isn't needed at the end.
true if ... isn't needed either
for loop should never be used in Ruby
reduce returns the result of the last expression inside the block. It was for in your code.
If you always need to return the same object in reduce, it might be a sign you could use each_with_object.
"test".split is just ["test"]
String and Enumerable have methods that could help you. Using them, you could write a much cleaner and more efficient method, as in #Phrogz answer.

Self enumerating function

I've got some code:
def my_each_with_index
return enum_for(:my_each_with_index) unless block_given?
i = 0
self.my_each do |x|
yield x, i
i += 1
end
self
end
It is my own code, but the line:
return enum_for(:my_each_with_index) unless block_given?
is found in solutions of other's. I can't get why they passed the function to enum_for as a parameter. When I invoke my function without a block, it won't return anything with or without enum_for. I could left sth like:
return unless block_given?
and it has the same result. Or am I wrong?
Being called without a block, it will return an enumerator:
▶ def my_each_with_index
▷ return enum_for(:my_each_with_index) unless block_given?
▷ end
#⇒ :my_each_with_index
▶ e = my_each_with_index
#⇒ #<Enumerator: main:my_each_with_index>
later on you might iterate on this enumerator:
▶ e.each { |elem| ... }
This behavior is specifically useful in some cases, like lazy iteration, passing block to this enumerator later etc.
Just returning nil cuts this ability off.
Think you for very precise answer. I recived also very good example to understand this issue for other new developers:
def iterator
yield 1
yield 2
yield 3
puts "koniec"
end
iterator { |v| puts v }
it = enum_for(:iterator)
puts it.next
puts it.next
puts it.next
puts it.next
Just run and analyze this code.
For any method that accepts a block, a good method implementation should have a well-defined behavior when the block is not given.
In the example shared by you, each_for_index is being re-implemented by author, may be to provide additional semantics or may be just for academic purpose given that its behavior is same as Ruby's Enumerable#each_with_index.
The documentation has following for Enumerable#each_with_index.
Calls block with two arguments, the item and its index, for each item
in enum. Given arguments are passed through to each().
If no block is given, an enumerator is returned instead.
In order to stay consistent with highlighted line indicating what should be the behavior if block is not given, one has to use something like
return enum_for(:my_each_with_index) unless block_given?
enum_for is interesting method
enum_for creates a new Enumerator which will enumerate by calling method on obj.
Below is an example reproduced from documentation:
str = "xyz"
enum = str.enum_for(:each_byte)
enum.each { |b| puts b }
# => 120
# => 121
# => 122
So, if one does not pass block to my_each_with_index, they have a chance to pass it later - just like one would have done with each_with_index.
e = obj.my_each_with_index
...
e.each { |x, i| # do something } # `my_each_with_index` executed later
In summary, my_each_with_index tries to be consistent with each_with_index and tries to be a well-behaved API.

Some simple Ruby questions - iterators, blocks, and symbols

My background is in PHP and C#, but I'd really like to learn RoR. To that end, I've started reading the official documentation. I have some questions about some code examples.
The first is with iterators:
class Array
def inject(n)
each { |value| n = yield(n, value) }
n
end
def sum
inject(0) { |n, value| n + value }
end
def product
inject(1) { |n, value| n * value }
end
end
I understand that yield means "execute the associated block here." What's throwing me is the |value| n = part of the each. The other blocks make more sense to me as they seem to mimic C# style lambdas:
public int sum(int n, int value)
{
return Inject((n, value) => n + value);
}
But the first example is confusing to me.
The other is with symbols. When would I want to use them? And why can't I do something like:
class Example
attr_reader #member
# more code
end
In the inject or reduce method, n represents an accumulated value; this means the result of every iteration is accumulated in the n variable. This could be, as is in your example, the sum or product of the elements in the array.
yield returns the result of the block, which is stored in n and used in the next iterations. This is what makes the result "cumulative."
a = [ 1, 2, 3 ]
a.sum # inject(0) { |n, v| n + v }
# n == 0; n = 0 + 1
# n == 1; n = 1 + 2
# n == 3; n = 3 + 3
=> 6
Also, to compute the sum you could also have written a.reduce :+. This works for any binary operation. If your method is named symbol, writing a.reduce :symbol is the same as writing a.reduce { |n, v| n.symbol v }.
attr and company are actually methods. Under the hood, they dynamically define the methods for you. It uses the symbol you passed to work out the names of the instance variable and the methods. :member results in the #member instance variable and the member and member = methods.
The reason you can't write attr_reader #member is because #member isn't an object in itself, nor can it be converted to a symbol; it actually tells ruby to fetch the value of the instance variable #member of the self object, which, at class scope, is the class itself.
To illustrate:
class Example
#member = :member
attr_accessor #member
end
e = Example.new
e.member = :value
e.member
=> :value
Remember that accessing unset instance variables yields nil, and since the attr method family accepts only symbols, you get: TypeError: nil is not a symbol.
Regarding Symbol usage, you can sort of use them like strings. They make excellent hash keys because equal symbols always refer to the same object, unlike strings.
:a.object_id == :a.object_id
=> true
'a'.object_id == 'a'.object_id
=> false
They're also commonly used to refer to method names, and can actually be converted to Procs, which can be passed to methods. This is what allows us to write things like array.map &:to_s.
Check out this article for more interpretations of the symbol.
For the definition of inject, you're basically setting up chained blocks. Specifically, the variable n in {|value| n = yield(n, value)} is essentially an accumulator for the block passed to inject. So, for example, for the definition of product, inject(1) {|value| n * value}, let's assume you have an array my_array = [1, 2, 3, 4]. When you call my_array.product, you start by calling inject with n = 1. each yields to the block defined in inject, which in turns yields to the block passed to inject itself with n (1) and the first value in the array (1 as well, in this case). This block, {|n, value| n * value} returns 1 == 1 * 1, which is set it inject's n variable. Next, 2 is yielded from each, and the block defined in inject block yields as yield(1, 2), which returns 2 and assigns it to n. Next 3 is yielded from each, the block yields the values (2, 3) and returns 6, which is stored in n for the next value, and so forth. Essentially, tracking the overall value agnostic of the calculation being performed in the specialised routines (sum and product) allows for generalization. Without that, you'd have to declare e.g.
def sum
n = 0
each {|val| n += val}
end
def product
n = 1
each {|val| n *= val}
end
which is annoyingly repetitive.
For your second question, attr_reader and its family are themselves methods that are defining the appropriate accessor routines using define_method internally, in a process called metaprogramming; they are not language statements, but just plain old methods. These functions expect to passed a symbol (or, perhaps, a string) that gives the name of the accessors you're creating. You could, in theory, use instance variables such as #member here, though it would be the value to which #member points that would be passed in and used in define_method. For an example of how these are implemented, this page shows some examples of attr_* methods.
def inject(accumulator)
each { |value| accumulator = yield(accumulator, value) }
accumulator
end
This is just yielding the current value of accumulator and the array item to inject's block and then storing the result back into accumulator again.
class Example
attr_reader #member
end
attr_reader is just a method whose argument is the name of the accessor you want to setup. So, in a contrived way you could do
class Example
#ivar_name = 'foo'
attr_reader #ivar_name
end
to create an getter method called foo
Your confusion with the first example may be due to your reading |value| n as a single expression, but it isn't.
This reformatted version might be clearer to you:
def inject(n)
each do |value|
n = yield(n, value)
end
return n
end
value is an element in the array, and it is yielded with n to whatever block is passed to inject, the result of which is set to n. If that's not clear, read up on the each method, which takes a block and yields each item in the array to it. Then it should be clearer how the accumulation works.
attr_reader is less weird when you consider that it is a method for generating accessor methods. It's not an accessor in itself. It doesn't need to deal with the #member variable's value, just its name. :member is just the interned version of the string 'member', which is the name of the variable.
You can think of symbols as lighter weight strings, with the additional bonus that every equal label is the same object - :foo.object_id == :foo.object_id, whereas 'foo'.object_id != 'foo'.object_id, because each 'foo' is a new object. You can try that for yourself in irb. Think of them as labels, or primitive strings. They're surprisingly useful and come up a lot, e.g. for metaprogramming or as keys in hashes. As pointed out elsewhere, calling object.send :foo is the same as calling object.foo
It's probably worth reading some early chapters from the 'pickaxe' book to learn some more ruby, it will help you understand and appreciate the extra stuff rails adds.
First you need to understand where to use symbols and where its not..
Symbol is especially used to represent something. Ex: :name, :age like that. Here we are not going to perform any operations using this.
String are used only for data processing. Ex: 'a = name'. Here I gonna use this variable 'a' further for other string operations in ruby.
Moreover, symbol is more memory efficient than strings and it is immutable. That's why ruby developer's prefers symbols than string.
You can even use inject method to calculate sum as (1..5).to_a.inject(:+)

Resources