Using yield within a select statement - ruby

If I wanted to prune an array by a given set of parameters I would write something like this:
array = [4,5,6,7,8]
a = array.select{|i| i>=5}
puts a.inspect
which would return [5,6,7,8].
I want to write a function "filter" which accomplishes the same thing. In this case my first thought is to write something like:
array = [4,5,6,7,8]
a = filter(array) {|i| i >= 5}
puts a.inspect
What I can't figure out is how to properly call yield within the method to invoke the code block during the select statement:
a = array.select{yield}
Doesn't seem to work since it attempts to call the code block on nil, not the array within the function. What's the proper way of doing this?

Don't know if it makes sense for you, but try:
def filter(array)
array.select { |i| yield(i) }
end
array = [4,5,6,7,8]
p filter(array) {|i| i >= 5}

When you write code within braces {...} to be passed to a method, this code is called a block. It is normally passed implicitly to a method (i.e. it is not a named argument). To invoke this implicit block, you call yield.
In your case, you don't want to invoke the block yourself; you want your filter method to pass the block along to select, where the actual filtering takes place.
To "pass along a block", you can make the method's block argument explicit by using the & prefix. Note that the name block in this example is just convention; there is no special block keyword. The important part is the & character:
def filter(array, &block)
array.select(&block)
end
array = [4,5,6,7,8]
filter(array) { |i| i >= 5 } # => [5,6,7,8]

Related

can somebody explain how does the following code execute?

I am following a linked tutorial from the Odin project, its about blocks and procs in ruby. I can't quite understand how does the following code work.
class Array
def eachEven(&wasABlock_nowAProc)
# We start with "true" because arrays start with 0, which is even.
isEven = true
self.each do |object|
if isEven
wasABlock_nowAProc.call object
end
isEven = (not isEven) # Toggle from even to odd, or odd to even.
end
end
end
['apple', 'bad apple', 'cherry', 'durian'].eachEven do |fruit|
puts 'Yum! I just love '+fruit+' pies, don\'t you?'
end
# Remember, we are getting the even-numbered elements
# of the array, all of which happen to be odd numbers,
# just because I like to cause problems like that.
[1, 2, 3, 4, 5].eachEven do |oddBall|
puts oddBall.to_s+' is NOT an even number!'
end
Is ['apple', 'bad apple', 'cherry', 'durian'] a block in this context and are we calling the method isEven on that block?
Does isEven used to only return true or false and if true the following code will be executed?
do |fruit|
puts 'Yum! I just love '+fruit+' pies, don\'t you?'
end
Also, what is this line doing?
self.each do |object|
if isEven
wasABlock_nowAProc.call object
end
end
If isEven is true then call [1, 2, 3, 4, 5] with the object??? What does calling that block with object mean?
Let's do it in parts:
1)The class Array was native from ruby, which means we are adding a method to all instances of Array, the method is the eachEven.
2) This method receives as parameter a block to be executed, keep this information in mind.
3) The ["apple", "bad apple", "cherry"] is an instance from Array, which means that we can execute the method eachEven for this array:
array = ["apple", "bad apple", "cherry"]
array.eachEven do |something|
# The do/end block is the parameter passed to the method `eachEven`
# the block will be binded in `wasABlock_nowAProc` in this case
end
4) Inside the method eachEven we get the self (self is the array itself) and execute another method from the Array instance: each (this method iterate over the array binding the current position to the variable inside brackets: |object|)
5) If the condition returns a positive result, it will execute the block inside if, in the case:
wasABlock_nowAProc.call object
# We execute the block of step 2 passing the current position value as a parameter
In fact, if we execute the following code:
array = [1, 2, 3, 4]
array.eachEven do |position_value|
puts "The #{position_value} is even"
end
We gonna get the following result:
The 1 is even # The block `wasABlock_nowAProc` will bind the 1 to the object and print it
The 3 is even # Same here, 3 will be used as the object in the execution of `wasABlock_nowAProc`
Hope it helps
Let's break apart the code here:
['apple', 'bad apple', 'cherry', 'durian'].eachEven do |fruit|
puts 'Yum! I just love '+fruit+' pies, don\'t you?'
end
What we have here boils down to:
receiver.method do |block_argument_one|
# this is the _body_ of the _block_
end
So:
['apple', 'bad apple', 'cherry', 'durian'] is called the receiver (or subject, or just object or instance)
eachEven is the method being called on the receiver
Everything from do to end is the block. It could also be { to } and work the same (well, mostly)
|fruit| is the block arguments list, with fruit being the only argument the block cares about.
puts … is the body of the block
What happens to the block is:
The code in the block gets interpreted, but not run
A placeholder for that code is passed to the method the block is attached to
the method runs, and can access the block while running
Now lets look at how a method that takes a block works:
class SomeClass
def some_method(regular_argument, &block_capture_argument)
# method body
# explicitly call the block:
block_capture_argument.call("first value passed to block")
# implicitly call the block (same as above)
yield "first value passed to block"
end
end
This shows several ways a block can be used:
When you define a method with the last argument beginning with &, a reference to the block is made available to the method by the name after the & (your wasABlock_nowAProc argument, for example). Then your method can do what is likes with the block, maybe calling it, or maybe even storing it somewhere a completely different method can use it.
Alternatively, you can use the yield keyword to call the block implicitly. In that case, you don't need a & argument to the method (but it still works if you do have that argument). Note that ruby allows you to attach a block to any method, regardless of if it uses that block. Methods can check if there was a block with the keyword block_given?, or check that the value of the & argument is present.
When you call the block, either with yield or with call, arguments you give to the call method are passed as arguments to the block.
The method can do whatever it wants with the block. It can call it once, twice, 0, or 300 times. It can call it with the same arguments each time or with different arguments each time.
In your specific example, the block gets called (with the value of object) for each item in the receiver, but only if the isEven variable is true.
Also in your specific example, you are calling the block from inside another block (which provides object for you), but don't let that confuse you.
To summarize:
blocks can be attached to any method using either do … end or {…}
blocks don't run unless the method they are attached to decides to call them
methods get called on a receiver
methods that use blocks get to decide how and when to use them
methods that use blocks can call blocks (or use yield) and pass any number of arguments to the block.
blocks can be defined to use those arguments (with the |…| syntax), and can name those arguments whatever they want (what matters is the order/position of the arguments).

How does a code block in Ruby know what variable belongs to an aspect of an object?

Consider the following:
(1..10).inject{|memo, n| memo + n}
Question:
How does n know that it is supposed to store all the values from 1..10? I'm confused how Ruby is able to understand that n can automatically be associated with (1..10) right away, and memo is just memo.
I know Ruby code blocks aren't the same as the C or Java code blocks--Ruby code blocks work a bit differently. I'm confused as to how variables that are in between the upright pipes '|' will automatically be assigned to parts of an object. For example:
hash1 = {"a" => 111, "b" => 222}
hash2 = {"b" => 333, "c" => 444}
hash1.merge(hash2) {|key, old, new| old}
How do '|key, old, new|' automatically assign themselves in such a way such that when I type 'old' in the code block, it is automatically aware that 'old' refers to the older hash value? I never assigned 'old' to anything, just declared it. Can someone explain how this works?
The parameters for the block are determined by the method definition. The definition for reduce/inject is overloaded (docs) and defined in C, but if you wanted to define it, you could do it like so (note, this doesn't cover all the overloaded cases for the actual reduce definition):
module Enumerable
def my_reduce(memo=nil, &blk)
# if a starting memo is not given, it defaults to the first element
# in the list and that element is skipped for iteration
elements = memo ? self : self[1..-1]
memo ||= self[0]
elements.each { |element| memo = blk.call(memo, element) }
memo
end
end
This method definition determines what values to use for memo and element and calls the blk variable (a block passed to the method) with them in a specific order.
Note, however, that blocks are not like regular methods, because they don't check the number of arguments. For example: (note, this example shows the usage of yield which is another way to pass a block parameter)
def foo
yield 1
end
# The b and c variables here will be nil
foo { |a, b, c| [a,b,c].compact.sum } # => 1
You can also use deconstruction to define variables at the time you run the block, for example if you wanted to reduce over a hash you could do something like this:
# this just copies the hash
{a: 1}.reduce({}) { |memo, (key, val)| memo[key] = val; memo }
How this works is, calling reduce on a hash implicitly calls to_a, which converts it to a list of tuples (e.g. {a: 1}.to_a = [[:a, 1]]). reduce passes each tuple as the second argument to the block. In the place where the block is called, the tuple is deconstructed into separate key and value variables.
A code block is just a function with no name. Like any other function, it can be called multiple times with different arguments. If you have a method
def add(a, b)
a + b
end
How does add know that sometimes a is 5 and sometimes a is 7?
Enumerable#inject simply calls the function once for each element, passing the element as an argument.
It looks a bit like this:
module Enumerable
def inject(memo)
each do |el|
memo = yield memo, el
end
memo
end
end
And memo is just memo
what do you mean, "just memo"? memo and n take whatever values inject passes. And it is implemented to pass accumulator/memo as first argument and current collection element as second argument.
How do '|key, old, new|' automatically assign themselves
They don't "assign themselves". merge assigns them. Or rather, passes those values (key, old value, new value) in that order as block parameters.
If you instead write
hash1.merge(hash2) {|foo, bar, baz| bar}
It'll still work exactly as before. Parameter names mean nothing [here]. It's actual values that matter.
Just to simplify some of the other good answers here:
If you are struggling understanding blocks, an easy way to think of them is as a primitive and temporary method that you are creating and executing in place, and the values between the pipe characters |memo| is simply the argument signature.
There is no special special concept behind the arguments, they are simply there for the method you are invoking to pass a variable to, like calling any other method with an argument. Similar to a method, the arguments are "local" variables within the scope of the block (there are some nuances to this depending on the syntax you use to call the block, but I digress, that is another matter).
The method you pass the block to simply invokes this "temporary method" and passes the arguments to it that it is designed to do. Just like calling a method normally, with some slight differences, such as there are no "required" arguments. If you do not define any arguments to receive, it will happily just not pass them instead of raising an ArgumentError. Likewise, if you define too many arguments for the block to receive, they will simply be nil within the block, no errors for not being defined.

Simple way to understand returning from a block in ruby

My code is supposed to print integers in an array.
odds_n_ends = [:weezard, 42, "Trady Blix", 3, true, 19, 12.345]
ints = odds_n_ends.select { |x| if x.is_a?(Integer) then return x end }
puts ints
It gives me an error in the 2nd line - in 'block in <main>': unexpected return (LocalJumpError)
When I remove the return, the code works exactly as desired.
To find the mistake in my understanding of blocks, I read related posts post1 and post2. But, I am not able to figure out how exactly are methods and blocks being called and why my approach is incorrect.
Is there some call stack diagram explanation for this ? Any simple explanation ?
I am confused because I have only programmed in Java before.
You generally don't need to worry exactly what blocks are to use them.
In this situation, return will return from the outside scope, e.g. if these lines were in a method, then from that method. It's the same as if you put a return statement inside a loop in Java.
Additional tips:
select is used to create a copied array where only the elements satisfying the condition inside the block are selected:
only_ints = odds_n_ends.select { |x| x.is_a?(Integer) }
You're using it as a loop to "pass back" variables that are integers, in which case you'd do:
only_ints = []
odds_n_ends.each { |x| if x.is_a?(Integer) then only_ints << x end }
If you try to wrap your code in a method then it won't give you an error:
def some_method
odds_n_ends = [:weezard, 42, "Trady Blix", 3, true, 19, 12.345]
ints = odds_n_ends.select { |x| if x.is_a?(Integer) then return true end }
puts ints
end
puts some_method
This code output is true. But wait, where's puts ints??? Ruby didn't reach that. When you put return inside a Proc, then you're returning in the scope of the entire method. In your example, you didn't have any method in which you put your code, so after it encountered 'return', it didn't know where to 'jump to', where to continue to.
Array#select basically works this way: For each element of the array (represented with |x| in your code), it evaluates the block you've just put in and if the block evaluates to true, then that element will be included in the new array. Try removing 'return' from the second line and your code will work:
ints = odds_n_ends.select { |x| if x.is_a?(Integer) then true end }
However, this isn't the most Ruby-ish way, you don't have to tell Ruby to explicitly return true. Blocks (the code between the {} ) are just like methods, with the last expression being the return value of the method. So this will work just as well:
ints = odds_n_ends.select { |x| if x.is_a?(Integer) } # imagine the code between {} is
#a method, just without name like 'def is_a_integer?' with the value of the last expression
#being returned.
Btw, there's a more elegant way to solve your problem:
odds_n_ends = [:weezard, 42, "Trady Blix", 3, true, 19, 12.345]
ints = odds_n_ends.grep(Integer)
puts ints
See this link. It basically states:
Returns an array of every element in enum for which Pattern ===
element.
To understand Pattern === element, simply imagine that Pattern is a set (let's say a set of Integers). Element might or might not be an element of that set (an integer). How to find out? Use ===. If you type in Ruby:
puts Integer === 34
it will evalute to true. If you put:
puts Integer === 'hey'
it will evalute to false.
Hope this helped!
In ruby a method always returns it's last statement, so in generall you do not need to return unless you want to return prematurely.
In your case you do not need to return anything, as select will create a new array with just the elements that return true for the given block. As ruby automatically returns it's last statement using
{ |x| x.is_a?(Integer) }
would be sufficient. (Additionally you would want to return true and not x if you think about "return what select expects", but as ruby treats not nil as true it also works...)
Another thing that is important is to understand a key difference of procs (& blocks) and lambdas which is causing your problem:
Using return in a Proc will return the method the proc is used in.
Using return in a Lambdas will return it's value like a method.
Think of procs as code pieces you inject in a method and of lambdas as anonymous methods.
Good and easy to comprehend read: Understanding Ruby Blocks, Procs and Lambdas
When passing blocks to methods you should simply put the value you want to be returned as the last statement, which can also be in an if-else clause and ruby will use the last actually reached statement.

Yield within Set to eliminate in an Array

I found the following code here for eliminating duplicate records in an array:
require 'set'
class Array
def uniq_by
seen = Set.new
select{ |x| seen.add?( yield( x ) ) }
end
end
And we can use the code above as follows:
#messages = Messages.all.uniq_by { |h| h.body }
I would like to know how and what happens when the method is called. Can someone explain the internals of the code above? In the uniq_by method, we did not do anything to handle block argument. How is the passed argument handled by uniq_by method?
Let's break it down :
seen = Set.new
Create an empty set
select{ |x| seen.add?( yield( x ) ) }
Array#select will keep elements when the block yields true.
seen.add?(yield(x)) will return true if the result of the block can be added in the set, or false if it can't.
Indeed, yield(x) will call the block passed to the uniq_by method, and pass x as an argument.
In our case, since our block is { |h| h.body }, it would be the same as calling seen.add?(x.body)
Since a set is unique, calling add? when the element already exists will return false.
So it will try to call .body on each element of the array and add it in a set, keeping elements where the adding was possible.
The method uniq_by accepts a block argument. This allows to specify, by what criteria you wish to identify two elements as "unique".
The yield statement will evaluate the value of the given block for the element and return the value of the elements body attribute.
So, if you call unique_by like above, you are stating that the attribute body of the elements has to be unique for the element to be unique.
To answer the more specific question you have: yield will call the passed block {|h| h.body} like a method, substituting h for the current x and therefore return x.body
In Ruby, when you are putting yield keyword inside any method(say #bar), you are explicitly telling #bar that, you will be using a block with the method #bar. So yield knows, inside the method block will be converted to a Proc object, and yield have to call that Proc object.
Example :
def bar
yield
end
p bar { "hello" } # "hello"
p bar # bar': no block given (yield) (LocalJumpError)
In the uniq_by method, we did not do anything to handle block argument. How is the passed argument handled by uniq_by method?
You did do, that is you put yield. Once you will put this yield, now method is very smart to know, what it supposed to so. In the line Messages.all.uniq_by { |h| h.body } you are passing a block { |h| h.body }, and inside the method definition of uniq_by, that block has been converted to a Proc object, and yield does Proc#call.
Proof:
def bar
p block_given? # true
yield
end
bar { "hello" } # "hello"
Better for understanding :
class Array
def uniq_by
seen = Set.new
select{ |x| seen.add?( yield( x ) ) }
end
end
is same as
class Array
def uniq_by
seen = Set.new
# Below you are telling uniq_by, you will be using a block with it
# by using `yield`.
select{ |x| var = yield(x); seen.add?(var) }
end
end
Read the doc of yield
Called from inside a method body, yields control to the code block (if any) supplied as part of the method call. If no code block has been supplied, calling yield raises an exception. yield can take an argument; any values thus yielded are bound to the block's parameters. The value of a call to yield is the value of the executed code block.
Array#select returns a new array containing all elements of the array for which the given block returns a true value.
The block argument of the select use Set#add? to determine whether the element is already there. add? returns nil if there is already the same element in the set, otherwise it returns the set itself and add the element to the set.
The block again pass the argument (an element of the array) to another block (the block passed to the uniq_by) using yield; Return value of the yield is return value of the block ({|h| h.body })
The select .. statement is basically similar to following statement:
select{ |x| seen.add?(x.body) }
But by using yield, the code avoid hard-coding of .body, and defers decision to the block.

Accessing a passed block in Ruby

I have a method that accepts a block, lets call it outer. It in turn calls a method that accepts another block, call it inner.
What I would like to have happen is for outer to call inner, passing it a new block which calls the first block.
Here's a concrete example:
class Array
def delete_if_index
self.each_with_index { |element, i| ** A function that removes the element from the array if the block passed to delete_if_index is true }
end
end
['a','b','c','d'].delete_if_index { |i| i.even? }
=> ['b','d']
the block passed to delete_if_index is called by the block passed to each_with_index.
Is this possible in Ruby, and, more broadly, how much access do we have to the block within the function that receives it?
You can wrap a block in another block:
def outer(&block)
if some_condition_is_true
wrapper = lambda {
p 'Do something crazy in this wrapper'
block.call # original block
}
inner(&wrapper)
else
inner(&passed_block)
end
end
def inner(&block)
p 'inner called'
yield
end
outer do
p 'inside block'
sleep 1
end
I'd say opening up an existing block and changing its contents is Doing it WrongTM, maybe continuation-passing would help here? I'd also be wary of passing around blocks with side-effects; I try and keep lambdas deterministic and have actions like deleting stuff in the method body. In a complex application this will likely make debugging a lot easier.
Maybe the example is poorly chosen, but your concrete example is the same as:
[1,2,3,4].reject &:even?
Opening up and modifying a block strikes me as code smell. It'd be difficult to write it in a way that makes the side effects obvious.
Given your example, I think a combination of higher order functions will do what you're looking to solve.
Update: It's not the same, as pointed out in the comments. [1,2,3,4].reject(&:even?) looks at the contents, not the index (and returns [1,3], not [2,4] as it would in the question). The one below is equivalent to the original example, but isn't vary pretty.
[1,2,3,4].each_with_index.reject {|element, index| index.even? }.map(&:first)
So here's a solution to my own question. The passed in block is implicitly converted into a proc which can be received with the & parameter syntax. The proc then exists inside the closure of any nested block, as it is assigned to a local variable in scope, and can be called by it:
class Array
def delete_if_index(&proc)
ary = []
self.each_with_index { |a, i| ary << self[i] unless proc.call(i) }
ary
end
end
[0,1,2,3,4,5,6,7,8,9,10].delete_if_index {|index| index.even?}
=> [1, 3, 5, 7, 9]
Here the block is converted into a proc, and assigned to the variable proc, which is then available within the block passed to each_with_index.

Resources