What happens with method after `end` like `end.compact`? - ruby

In this code,
working_days = open(ARGV[0].to_s,'r').each_line.map do |line|
do_something
end.compact
the map function returns an array [1, nil, 3, nil]. I appended compact to the keyword end. I want to know what happens behind the scenes. After I add compact, does the returned array become:
[1] → [1,3] or
[1] → [1, nil] → [1, nil, 3] → [1, nil, 3, nil] → [1, 3]
How can I use pry to inspect every step?
Will compact be sent along with the do end block into the map function?

There's no magic here. Your code is identical to the following:
tmp = open(...).each_line.map do |line|
do_something
end
working_days = tmp.compact
You've simply removed the middle step, assigning the return value of map to a temporary variable.
It's the difference between doing this...
a(b(c()))
and doing this:
tmp = c()
tmp = b(tmp)
tmp = a(tmp)
You're simply invoking a function on a return value directly, rather than using a second statement.

I don't understand why you are thinking it in a complicated way. It is done as:
[1, nil, 3, nil] → [1, 3]
There are no intermediate steps that you can observe.
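One way to check the order of events yourself is to record when each piece runs. The following is only a sketch: `trace_map_then_compact` and its log messages are hypothetical names that do not appear in the question, and `tap` is used purely to peek at the array that `map` hands over to `compact`.

```ruby
# Hypothetical helper (not from the question): logs the order of evaluation.
def trace_map_then_compact(values)
  log = []
  result = values.map do |v|
    log << "block(#{v.inspect})"   # the block runs once per element
    v
  end.tap { |mapped| log << "map returned #{mapped.inspect}" }.compact
  log << "compact returned #{result.inspect}"
  [result, log]
end

result, log = trace_map_then_compact([1, nil, 3, nil])
puts log
# block(1)
# block(nil)
# block(3)
# block(nil)
# map returned [1, nil, 3, nil]
# compact returned [1, 3]
```

The log suggests that compact is called exactly once, on the complete array returned by map, and is never passed into the block.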

Related

How to return a splat from a method in Ruby

I wanted to create a method for arrays to get a splat of the array in return. Is this possible to do in Ruby?
For example here's my current code:
Array.module_eval do
def to_args
return *self
end
end
I expect [1,2,3].to_args to return 1,2,3 but it ends up returning [1,2,3]
You cannot return a "splat" from Ruby. But you can return an array and then splat it yourself:
def args
[1, 2, 3]
end
x, y, z = args
# x == 1
# y == 2
# z == 3
x, *y = args
# x == 1
# y == [2, 3]
Of course, this works on any array, so really there is no need for monkey patching a to_args method into Array - it's all about how the calling concern is using the splat operator:
arr = [1, 2, 3]
x, y, z = arr
x, *y = arr
*x, y = arr
Same mechanism works with block arguments:
arr = [1, 2, 3]
arr.tap {|x, *y| y == [2, 3]}
Even more advanced usage:
arr = [1, [2, 3]]
x, (y, z) = arr
The concept that clarifies this for me is that although you can simulate the return of multiple values in Ruby, a method can really return only 1 object, so that simulation bundles up the multiple values in an Array.
The array is returned, and you can then deconstruct it, as you can any array.
def foo
[1, 2]
end
one, two = foo
Not exactly. What it looks like you're trying to do (the question doesn't give usage examples) is to force multiple return values. However, returning the splatted array self may do exactly what you need, as long as you're properly handling multiple return values on the calling side of the equation.
Consider these examples:
first, *rest = [1, 2, 3] # first = 1, rest = [2, 3]
*rest, last = [1, 2, 3] # rest = [1, 2], last = 3
first, *rest, last = [1, 2, 3] # first = 1, rest = [2], last = 3
Other than this, I can't actually see any way to capture or pass along multiple values like you're suggesting. I think the answer for your question, if I understand it correctly, is all in the caller's usage.
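To make the "it's all in the caller" point concrete, here is a small sketch. The method names `triple` and `sum3` are made up for illustration. It shows that `return *array` still hands back a single Array, and that the splat has to happen where the value is used.

```ruby
# Hypothetical methods for illustration only.
def triple
  return *[1, 2, 3]   # the splat here still produces one Array object
end

def sum3(a, b, c)
  a + b + c
end

p triple.class    # => Array
p sum3(*triple)   # => 6  (the caller splats the returned array)

first, *rest = triple
p first           # => 1
p rest            # => [2, 3]
```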

Block with two parameters

I found this code by user Hirolau:
def sum_to_n?(a, n)
a.combination(2).find{|x, y| x + y == n}
end
a = [1, 2, 3, 4, 5]
sum_to_n?(a, 9) # => [4, 5]
sum_to_n?(a, 11) # => nil
How can I know when I can send two parameters to a predefined method like find? It's not clear to me because sometimes it doesn't work. Is this something that has been redefined?
If you look at the documentation of Enumerable#find, you see that it accepts only one parameter to the block. The reason why you can send it two is that Ruby conveniently lets you do this with blocks, based on its "parallel assignment" structure:
[[1,2,3], [4,5,6]].each {|x,y,z| puts "#{x}#{y}#{z}"}
# 123
# 456
So basically, each yields an array element to the block, and because Ruby block syntax allows "expanding" array elements to their components by providing a list of arguments, it works.
You can find more tricks with block arguments here.
a.combination(2) results in an array of arrays, where each of the sub array consists of 2 elements. So:
a = [1,2,3,4]
a.combination(2)
# => [[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]
As a result, you are sending one array like [1,2] to find's block, and Ruby performs the parallel assignment to assign 1 to x and 2 to y.
Also see this SO question, which brings other powerful examples of parallel assignment, such as this statement:
a,(b,(c,d)) = [1,[2,[3,4]]]
find does not take two parameters, it takes one. The reason the block in your example takes two parameters is that it is using destructuring. The preceding code a.combination(2) gives an array of arrays of two elements, and find iterates over it. Each element (an array of two elements) is passed one at a time to the block as its single parameter. However, when you write more parameters than there are arguments, Ruby tries to adjust the parameters by destructuring the array. The part:
find{|x, y| x + y == n}
is a shorthand for writing:
find{|(x, y)| x + y == n}
The find method iterates over elements. It takes a single argument, in this case a block (which does take two arguments for a hash):
h = {foo: 5, bar: 6}
result = h.find {|k, v| k == :foo && v == 5}
puts result.inspect #=> [:foo, 5]
The block takes only one argument for arrays though unless you use destructuring.
Update: It seems that it is destructuring in this case.
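One possible reason the trick "sometimes doesn't work" is that this automatic destructuring applies to blocks and procs, but not to lambdas, which check their argument count strictly. The sketch below illustrates the difference; `destructures?` is a hypothetical helper name, not something from the answers above.

```ruby
# Hypothetical helper: reports whether a callable accepts one [x, y] array
# in place of two separate arguments.
def destructures?(callable)
  callable.call([4, 5]) == 9
rescue ArgumentError
  false   # lambdas raise instead of destructuring the array
end

p destructures?(proc   { |x, y| x + y })   # => true
p destructures?(lambda { |x, y| x + y })   # => false

# An ordinary block behaves like the proc:
p [[1, 2], [4, 5]].find { |x, y| x + y == 9 }   # => [4, 5]
```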

How does to_enum(:method) receive its block here?

This code, from an example I found, counts the number of elements in the array which are equal to their index. But how?
[4, 1, 2, 0].to_enum(:count).each_with_index{|elem, index| elem == index}
I could not have done it only with chaining, and the order of evaluation within the chain is confusing.
What I understand is that we're using the overload of Enumerable#count which, if a block is given, counts the number of elements yielding a true value. I see that each_with_index has the logic for whether the item is equal to its index.
What I don't understand is how each_with_index becomes the block argument of count, or why the each_with_index works as though it was called directly on [4,1,2,0]. If map_with_index existed, I could have done:
[4,1,2,0].map_with_index{ |e,i| e==i ? e : nil}.compact
but help me understand this enumerable-based style please - it's elegant!
Let's start with a simpler example:
[4, 1, 2, 0].count{|elem| elem == 4}
=> 1
So here the count method returns 1 since the block returns true for one element of the array (the first one).
Now let's look at your code. First, Ruby creates an enumerator object when we call to_enum:
[4, 1, 2, 0].to_enum(:count)
=> #<Enumerator: [4, 1, 2, 0]:count>
Here the enumerator is waiting to execute the iteration, using the [4, 1, 2, 0] array and the count method. Enumerators are like a pending iteration, waiting to happen later.
Next, you call the each_with_index method on the enumerator, and provide a block:
...each_with_index{|elem, index| elem == index}
This calls the Enumerator#each_with_index method on the enumerator object you created above. What Enumerator#each_with_index does is start the pending iteration, using the given block. But it also passes an index value to the block, along with the values from the iteration. Since the pending iteration was set up to use the count method, the enumerator will call Array#count. This passes each element from the array back to the enumerator, which passes them into the block along with the index. Finally, Array#count counts up the true values, just like with the simpler example above.
For me the key to understanding this is that you're using the Enumerator#each_with_index method.
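As a cross-check, the same count can be obtained by chaining in the opposite order, with each_with_index first and count second. The method names below are hypothetical and exist only to put the two forms side by side.

```ruby
# Hypothetical wrappers comparing the two chaining orders.
def fixed_points_via_to_enum(arr)
  # count drives the iteration; each_with_index supplies the index
  arr.to_enum(:count).each_with_index { |elem, index| elem == index }
end

def fixed_points_via_each_with_index(arr)
  # each_with_index drives the iteration; count consumes [elem, index] pairs
  arr.each_with_index.count { |elem, index| elem == index }
end

p fixed_points_via_to_enum([4, 1, 2, 0])           # => 2
p fixed_points_via_each_with_index([4, 1, 2, 0])   # => 2
```

The second form may be easier to read, because the method that produces the final value (count) comes last in the chain.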
The answer is but a click away: the documentation for Enumerator:
Most [Enumerator] methods [but presumably also Kernel#to_enum and Kernel#enum_for] have two forms: a block form where the contents are evaluated for each item in the enumeration, and a non-block form which returns a new Enumerator wrapping the iteration.
It is the second that applies here:
enum = [4, 1, 2, 0].to_enum(:count) # => #<Enumerator: [4, 1, 2, 0]:count>
enum.class # => Enumerator
enum_ewi = enum.each_with_index
# => #<Enumerator: #<Enumerator: [4, 1, 2, 0]:count>:each_with_index>
enum_ewi.class # => Enumerator
enum_ewi.each {|elem, index| elem == index} # => 2
Note in particular irb's return from the third line. The documentation goes on to say, "This allows you to chain Enumerators together." and gives map.with_index as an example.
Why stop here?
enum_ewi == enum_ewi.each.each.each # => true
yet_another = enum_ewi.each_with_index
# => #<Enumerator: #<Enumerator: #<Enumerator: [4, 1, 2, 0]:count>:each_with_index>:each_with_index>
yet_another.each_with_index {|e,i| puts "e = #{e}, i = #{i}"}
e = [4, 0], i = 0
e = [1, 1], i = 1
e = [2, 2], i = 2
e = [0, 3], i = 3
yet_another.each_with_index {|e,i| e.first.first == i} # => 2
Nice answer, @Cary. I'm not exactly sure how the block makes its way through the chain of objects, but despite appearances, the block is being executed by the count method, as in this stack trace, even though its variables are bound to those yielded by each_with_index:
enum = [4, 1, 2, 0].to_enum(:count)
enum.each_with_index{|e,i| raise "--" if i==3; puts e; e == i}
4
1
2
RuntimeError: --
from (irb):243:in `block in irb_binding'
from (irb):243:in `count'
from (irb):243:in `each_with_index'
from (irb):243

Making map! enumerator do what I want

Having an array
a = 1, 2, 3, 4
And an enumerator:
e = a.map!
Then, by calling e.next repeatedly, array a gets nicely destroyed:
e.next #=> 1
a #=> [1, 2, 3, 4]
e.next #=> 2
a #=> [nil, 2, 3, 4]
e.next #=> 3
a #=> [nil, nil, 3, 4]
That's so hilarious! But when I try
e.next { |x| 2 * x } # => 4
I get
a #=> [nil, nil, nil, 4]
instead of desired
a #=> [nil, nil, nil, 8]
What am I misunderstanding? How to make a.map! do what I want with the elements?
My problem is, that I do not fully understand enumerators. With the previous code in place, for example, enumerator e constitutes a backdoor to a:
e.each { 42 }
a #=> [42, 42, 42, 42]
I would like to know how to do this gradually, with values other than nil. (I can gradually fill it with nils using e.rewind and e.next several times, as I showed before.)
To make map! behave as you want, you need the Enumerator#feed method. Consider this:
ary = *1..4
enum = ary.map!
# the `loop` method handles `StopIteration` for us
loop do
x = enum.next
enum.feed(x * 2)
end
ary
# => [2, 4, 6, 8]
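To watch the array change gradually, as the question asks, the same next/feed loop can take a snapshot after every step. This is a sketch under the assumption that a fed value is written back only when the enumerator resumes on the following next call. `double_in_place_stepwise` is a hypothetical name.

```ruby
# Hypothetical helper: doubles each element via next/feed and records the
# state of the array after every feed.
def double_in_place_stepwise(ary)
  snapshots = []
  enum = ary.map!
  loop do                     # loop rescues StopIteration for us
    x = enum.next             # resumes map!, storing the previously fed value
    enum.feed(x * 2)          # value that map! will store on the next resume
    snapshots << ary.dup
  end
  snapshots
end

ary = [1, 2, 3, 4]
steps = double_in_place_stepwise(ary)
steps.each { |s| p s }
p ary   # => [2, 4, 6, 8]
```

Each snapshot lags one step behind the feed, because the stored value only lands in the array once map! continues.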
According to the reference, Enumerator#next doesn't accept a block, so the block in your next call has no effect. If you just want to double the last element while clearing all the others, consider a more direct approach such as a = a[0..-2].map!{|x| nil} + [a.last*2] (there may be something more elegant). In any case, could you provide a more detailed use case, to make sure this is what you really need?
a.map! accepts a block, but returns an enumerator if no block is supplied. Enumerator#next does not accept a block.
You want to use this to accomplish your goal:
a.map! {|x| x * 2}
if you want to multiply all elements in the array by 2.
For info on next, check out http://ruby-doc.org/core-2.0/Enumerator.html#method-i-next
If you want the output to be exactly [nil, nil, nil, 8] you could do something like:
func = lambda { |x|
unless x == 4
nil
else
x * 2
end
}
a.map!(&func) # => [nil, nil, nil, 8]

How to map and remove nil values in Ruby

I have a map which either changes a value or sets it to nil. I then want to remove the nil entries from the list. The list doesn't need to be kept.
This is what I currently have:
# A simple example function, which returns a value or nil
def transform(n)
rand > 0.5 ? n * 10 : nil
end
items.map! { |x| transform(x) } # [1, 2, 3, 4, 5] => [10, nil, 30, 40, nil]
items.reject! { |x| x.nil? } # [10, nil, 30, 40, nil] => [10, 30, 40]
I'm aware I could just do a loop and conditionally collect in another array like this:
new_items = []
items.each do |x|
x = transform(x)
new_items.append(x) unless x.nil?
end
items = new_items
But it doesn't seem that idiomatic. Is there a nice way to map a function over a list, removing/excluding the nils as you go?
You could use compact:
[1, nil, 3, nil, nil].compact
=> [1, 3]
I'd like to remind people that if you're getting an array containing nils as the output of a map block, and that block tries to conditionally return values, then you've got code smell and need to rethink your logic.
For instance, if you're doing something that does this:
[1,2,3].map{ |i|
if i % 2 == 0
i
end
}
# => [nil, 2, nil]
Then don't. Instead, prior to the map, reject the stuff you don't want or select what you do want:
[1,2,3].select{ |i| i % 2 == 0 }.map{ |i|
i
}
# => [2]
I consider using compact to clean up a mess as a last-ditch effort to get rid of things we didn't handle correctly, usually because we didn't know what was coming at us. We should always know what sort of data is being thrown around in our program; unexpected or unknown data is bad. Anytime I see nils in an array I'm working on, I dig into why they exist, and see if I can improve the code generating the array, rather than allow Ruby to waste time and memory generating nils and then sifting through the array to remove them later.
'Just my $%0.2f.' % [2.to_f/100]
Try using reduce or inject.
[1, 2, 3].reduce([]) { |memo, i|
if i % 2 == 0
memo << i
end
memo
}
I agree with the accepted answer that we shouldn't map and compact, but not for the same reasons.
I feel deep inside that map then compact is equivalent to select then map. Consider: map is a one-to-one function. If you are mapping from some set of values, and you map, then you want one value in the output set for each value in the input set. If you are having to select before-hand, then you probably don't want a map on the set. If you are having to select afterwards (or compact) then you probably don't want a map on the set. In either case you are iterating twice over the entire set, when a reduce only needs to go once.
Also, in English, you are trying to "reduce a set of integers into a set of even integers".
Ruby 2.7+
There is now!
Ruby 2.7 is introducing filter_map for this exact purpose. It's idiomatic and performant, and I'd expect it to become the norm very soon.
For example:
numbers = [1, 2, 5, 8, 10, 13]
numbers.filter_map { |i| i * 2 if i.even? }
# => [4, 16, 20]
In your case, as the block evaluates to falsey, simply:
items.filter_map { |x| process_x url }
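One caveat worth keeping in mind: filter_map drops every falsey block result, so it removes false as well as nil, whereas compact removes only nil. The sketch below assumes Ruby 2.7 or later, and the method names are hypothetical.

```ruby
# Hypothetical wrappers contrasting the two approaches (Ruby 2.7+).
def keep_with_compact(values)
  values.map { |v| v }.compact       # removes nil only
end

def keep_with_filter_map(values)
  values.filter_map { |v| v }        # removes nil and false
end

p keep_with_compact([1, nil, false, 2])      # => [1, false, 2]
p keep_with_filter_map([1, nil, false, 2])   # => [1, 2]
```

If false is a legitimate value in the output, map followed by compact may still be the safer choice.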
"Ruby 2.7 adds Enumerable#filter_map" is a good read on the subject, with some performance benchmarks against some of the earlier approaches to this problem:
N = 100_000
enum = 1.upto(1_000)
Benchmark.bmbm do |x|
x.report("select + map") { N.times { enum.select { |i| i.even? }.map{ |i| i + 1 } } }
x.report("map + compact") { N.times { enum.map { |i| i + 1 if i.even? }.compact } }
x.report("filter_map") { N.times { enum.filter_map { |i| i + 1 if i.even? } } }
end
# Rehearsal -------------------------------------------------
# select + map 8.569651 0.051319 8.620970 ( 8.632449)
# map + compact 7.392666 0.133964 7.526630 ( 7.538013)
# filter_map 6.923772 0.022314 6.946086 ( 6.956135)
# --------------------------------------- total: 23.093686sec
#
# user system total real
# select + map 8.550637 0.033190 8.583827 ( 8.597627)
# map + compact 7.263667 0.131180 7.394847 ( 7.405570)
# filter_map 6.761388 0.018223 6.779611 ( 6.790559)
Definitely compact is the best approach for solving this task. However, we can achieve the same result just with a simple subtraction:
[1, nil, 3, nil, nil] - [nil]
=> [1, 3]
In your example:
items.map! { |x| process_x url } # [1, 2, 3, 4, 5] => [1, nil, 3, nil, nil]
it does not look like the values have changed other than being replaced with nil. If that is the case, then:
items.select{|x| process_x url}
will suffice.
If you wanted a looser criterion for rejection, for example, to reject empty strings as well as nil, you could use blank? (which comes from ActiveSupport, so it is available in Rails but not in plain Ruby):
[1, nil, 3, 0, ''].reject(&:blank?)
=> [1, 3, 0]
If you wanted to go further and reject zero values (or apply more complex logic to the process), you could pass a block to reject:
[1, nil, 3, 0, ''].reject do |value| value.blank? || value==0 end
=> [1, 3]
[1, nil, 3, 0, '', 1000].reject do |value| value.blank? || value==0 || value>10 end
=> [1, 3]
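Outside Rails, blank? is not defined, so a plain-Ruby approximation can stand in for it. The helper `blankish?` below is a hypothetical name and only a rough substitute: unlike ActiveSupport's blank?, it does not treat false or whitespace-only strings as blank.

```ruby
# Hypothetical plain-Ruby stand-in for ActiveSupport's blank?.
# Treats nil and anything responding true to empty? as "blank".
def blankish?(value)
  value.nil? || (value.respond_to?(:empty?) && value.empty?)
end

p [1, nil, 3, 0, ''].reject { |v| blankish?(v) }             # => [1, 3, 0]
p [1, nil, 3, 0, ''].reject { |v| blankish?(v) || v == 0 }   # => [1, 3]
```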
You can use the #compact method on the resulting array.
[10, nil, 30, 40, nil].compact # => [10, 30, 40]
each_with_object is probably the cleanest way to go here:
new_items = items.each_with_object([]) do |x, memo|
ret = process_x(x)
memo << ret unless ret.nil?
end
In my opinion, each_with_object is better than inject/reduce in conditional cases because you don't have to worry about the return value of the block.
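The difference in block return values can be sketched as follows. All three method names are hypothetical. The third one deliberately omits the trailing memo to show what goes wrong with reduce when the block's last expression is not the accumulator.

```ruby
# Hypothetical methods comparing reduce with each_with_object.
def evens_with_reduce(numbers)
  numbers.reduce([]) do |memo, i|
    memo << i if i.even?
    memo                      # reduce needs the accumulator returned
  end
end

def evens_with_each_with_object(numbers)
  # the block's return value is ignored; memo is passed along automatically
  numbers.each_with_object([]) { |i, memo| memo << i if i.even? }
end

def evens_with_forgetful_reduce(numbers)
  numbers.reduce([]) { |memo, i| memo << i if i.even? }
rescue NoMethodError
  :memo_became_nil            # the block returned nil for an odd element
end

p evens_with_reduce(1..5)             # => [2, 4]
p evens_with_each_with_object(1..5)   # => [2, 4]
p evens_with_forgetful_reduce(1..5)   # => :memo_became_nil
```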
One more way to accomplish this is shown below. Here, we use Enumerable#each_with_object to collect values, and make use of Object#tap to get rid of the temporary variable that is otherwise needed for the nil check on the result of the process_x method.
items.each_with_object([]) {|x, obj| (process x).tap {|r| obj << r unless r.nil?}}
Complete example for illustration:
items = [1,2,3,4,5]
def process x
rand(10) > 5 ? nil : x
end
items.each_with_object([]) {|x, obj| (process x).tap {|r| obj << r unless r.nil?}}
Alternate approach:
By looking at the method you are calling, process_x url, it is not clear what the purpose of the input x is in that method. If I assume that you are going to process the value of x by passing it some url and determine which of the xs really get processed into valid non-nil results, then maybe Enumerable#group_by is a better option than Enumerable#map.
h = items.group_by {|x| (process x).nil? ? "Bad" : "Good"}
#=> {"Bad"=>[1, 2], "Good"=>[3, 4, 5]}
h["Good"]
#=> [3,4,5]
