What's the difference between the each and collect methods in Ruby? [duplicate]

This question already has answers here:
Array#each vs. Array#map
(7 answers)
Closed 6 years ago.
From this code I don't know the difference between the two methods, collect and each.
a = ["L","Z","J"].collect{|x| puts x.succ} #=> M AA K
print a.class #=> Array
b = ["L","Z","J"].each{|x| puts x.succ} #=> M AA K
print b.class #=> Array

Array#each takes an array and applies the given block to every item. It neither modifies the array nor creates a new object; it is just a way of looping over the items. It also returns self.
arr=[1,2,3,4]
arr.each {|x| puts x*2}
Prints 2,4,6,8 and returns [1,2,3,4] no matter what
Array#collect is the same as Array#map: it applies the given block to every item and returns a new array of the results. Simply put, it 'projects each element of a sequence into a new form'.
arr.collect {|x| x*2}
Returns [2,4,6,8]
And in your code
a = ["L","Z","J"].collect{|x| puts x.succ} #=> M AA K
a is an Array, but it is actually an array of nils, [nil, nil, nil], because puts x.succ returns nil (even though it prints M AA K).
And
b = ["L","Z","J"].each{|x| puts x.succ} #=> M AA K
also is an Array. But its value is ["L","Z","J"], because it returns self.
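To make the return values visible, here is a minimal sketch that captures what each call hands back. The variable names (letters, a, b, c) are chosen for illustration only.

```ruby
letters = ["L", "Z", "J"]

# collect builds a new array from the block's return values;
# puts returns nil, so every slot ends up nil.
a = letters.collect { |x| puts x.succ }

# each ignores the block's return value and returns the receiver.
b = letters.each { |x| puts x.succ }

# Dropping puts lets collect keep the computed strings.
c = letters.collect { |x| x.succ }

p a  # [nil, nil, nil]
p b  # ["L", "Z", "J"]
p c  # ["M", "AA", "K"]
```

If the goal is only to print, each is enough; if the goal is to keep the transformed values, the block passed to collect has to return them.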

Array#each just passes each element to the block, then returns the original array. Array#collect passes each element to the block and puts the block's result into a new array that gets returned:
[1, 2, 3].each { |x| x + 1 } #=> [1, 2, 3]
[1, 2, 3].collect { |x| x + 1 } #=> [2, 3, 4]

each is for when you want to iterate over an array, and do whatever you want in each iteration. In most (imperative) languages, this is the "one size fits all" hammer that programmers reach for when you need to process a list.
In more functional languages, you only do this sort of generic iteration if you can't do it any other way. Most of the time, either map or reduce will be more appropriate (collect and inject in Ruby).
collect is for when you want to turn one array into another array
inject is for when you want to turn an array into a single value
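As a small illustration of the two cases above, the following sketch turns one array into another with collect and folds an array into a single value with inject. The prices data is made up for the example.

```ruby
prices = [3, 5, 8]

# collect: one array in, another array of the same length out.
doubled = prices.collect { |price| price * 2 }

# inject: one array in, a single accumulated value out.
total = prices.inject(0) { |sum, price| sum + price }

p doubled  # [6, 10, 16]
p total    # 16
```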

Here are the two source code snippets, according to the docs...
VALUE
rb_ary_each(VALUE ary)
{
    long i;
    RETURN_ENUMERATOR(ary, 0, 0);
    for (i=0; i<RARRAY_LEN(ary); i++) {
        rb_yield(RARRAY_PTR(ary)[i]);
    }
    return ary;
}
# .... .... .... .... .... .... .... .... .... .... .... ....
static VALUE
rb_ary_collect(VALUE ary)
{
    long i;
    VALUE collect;
    RETURN_ENUMERATOR(ary, 0, 0);
    collect = rb_ary_new2(RARRAY_LEN(ary));
    for (i = 0; i < RARRAY_LEN(ary); i++) {
        rb_ary_push(collect, rb_yield(RARRAY_PTR(ary)[i]));
    }
    return collect;
}
rb_yield() returns the value returned by the block (see also this blog post on metaprogramming).
So each just yields and returns the original array, while collect creates a new array and pushes the results of the block into it; then it returns this new array.
Source snippets: each, collect
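The same logic can be sketched in plain Ruby. The helpers my_each and my_collect below are hypothetical names invented for this illustration, not part of Ruby, and they leave out the RETURN_ENUMERATOR step (returning an Enumerator when no block is given).

```ruby
# Hypothetical helper mirroring rb_ary_each: yield every element,
# ignore what the block returns, hand back the original array.
def my_each(ary)
  i = 0
  while i < ary.length
    yield ary[i]
    i += 1
  end
  ary
end

# Hypothetical helper mirroring rb_ary_collect: push each block
# result into a fresh array and return that new array.
def my_collect(ary)
  result = []
  i = 0
  while i < ary.length
    result << yield(ary[i])
    i += 1
  end
  result
end

p my_each([1, 2, 3]) { |x| x + 1 }     # [1, 2, 3]
p my_collect([1, 2, 3]) { |x| x + 1 }  # [2, 3, 4]
```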

The difference is what it returns. In your example above
a == [nil,nil,nil] (the value of puts x.succ) while b == ["L", "Z", "J"] (the original array)
From the ruby-doc, collect does the following:
Invokes block once for each element of
self. Creates a new array containing
the values returned by the block.
each always returns the original array. Does that make sense?

each is a method that every class including the Enumerable module must define; the other Enumerable methods use it to iterate through the object. Called without a block, Array#each returns an Enumerator (Enumerable::Enumerator in Ruby 1.8). The each method of each class behaves differently.
In the Array class, when a block is passed to each, it runs the block on each element but in the end returns self. This is useful when you don't need a new array, but just want to pick elements from the array and use them as arguments to other methods. collect and map return a new array with the return values of the block for each element. You can use map! and collect! to perform the operation on the original array.
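A short sketch of the block-less form, assuming a reasonably modern Ruby where the class is called Enumerator. The names letters, enum and labelled are illustrative.

```ruby
letters = ["a", "b", "c"]

# Without a block, each returns an Enumerator instead of iterating.
enum = letters.each
first_item = enum.next

# Enumerators can be chained with other iteration helpers.
labelled = letters.each.with_index(1).map { |x, i| "#{i}: #{x}" }

p enum.class   # Enumerator
p first_item   # "a"
p labelled     # ["1: a", "2: b", "3: c"]
```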

I think an easier way to understand it would be as below:
nums = [1, 1, 2, 3, 5]
square = nums.each { |num| num ** 2 } # => [1, 1, 2, 3, 5]
Instead, if you use collect:
square = nums.collect { |num| num ** 2 } # => [1, 1, 4, 9, 25]
In addition, you can use .collect! to mutate the original array.
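The difference between collect and collect! can be checked with a small sketch like the one below; the intermediate variable names are only for illustration.

```ruby
nums = [1, 1, 2, 3, 5]

# collect leaves the receiver alone and returns a new array.
squares = nums.collect { |num| num ** 2 }
untouched = nums.dup

# collect! replaces each element in place and returns the receiver.
result = nums.collect! { |num| num ** 2 }

p squares    # [1, 1, 4, 9, 25]
p untouched  # [1, 1, 2, 3, 5]
p nums       # [1, 1, 4, 9, 25]
```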

Related

Use of Variables in Block Method of Ruby [duplicate]

This question already has answers here:
What does the |variable| syntax mean? [duplicate]
(4 answers)
Closed 2 years ago.
I am learning Ruby programming and I found the following while working with arrays and files:
nums = Array.new(10) { |e| e = e * 2; }
puts nums
File.foreach("users.txt") { |line| puts line }
The program works well. However, I didn't know what is meant by |e| or |line| in the blocks
Kindly explain to me the use of the variables in blocks.
As Viktor mentioned, they are block arguments. e represents the index of an array item in each iteration, and line represents each line as you iterate through the lines of a file.
Here is roughly equivalent code written with explicit loops:
nums = Array.new(10) # Returns an array of size 10 filled up with nil values
i = 0
while i < nums.length
  e = i * 2 # This is the `e` variable in the block
  nums[i] = e
  i += 1
end
file_lines = File.readlines("users.txt")
i = 0
while i < file_lines.length
  line = file_lines[i] # This is the `line` variable in the block
  puts line
  i += 1
end
By the way, in the first example the assignment is unnecessary, because a block returns the last evaluated value after each iteration, so you can rewrite it like this: nums = Array.new(10) { |e| e * 2 }
However, I didn't know what is meant by |e| or |line| in the blocks
This is called a parameter list. A parameter is like a "hole" in a subroutine (in this case a block, but methods can also have parameters and thus a parameter list) that can be filled in later. The thing that is being used to fill in the hole is called an argument. Arguments are being passed in an argument list.
So, in this case you have a parameter list with one parameter called e, or line, respectively.
Note that the assignment in the first snippet is useless. There is no code after the assignment which uses the parameter again, so it doesn't do anything.
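To see parameters being filled with arguments, here is a minimal sketch. The method twice is a hypothetical example written for this explanation: it calls its block with the arguments 1 and 2, and the block parameter n receives them in turn.

```ruby
# Hypothetical method: passes 1 and then 2 as arguments to the block
# and collects what the block returns.
def twice
  [yield(1), yield(2)]
end

# n is the block's parameter; 1 and 2 are the arguments.
results = twice { |n| n * 10 }

# Reassigning the parameter adds nothing: the block's value is
# simply the value of its last expression.
same = twice { |n| n = n * 10 }

p results  # [10, 20]
p same     # [10, 20]
```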
{ ... } defines a block and |...| holds the block's argument(s). Block arguments are similar to method arguments.
You can use do ... end instead of { ... } and split the code to multiple lines:
File.foreach("users.txt") do |line|
  puts line
end
Array.new(10) { ... } creates a 10-element array. For each element, the block is called and the element's (zero-based) index is passed to the block. This is what e is in your example. You can choose the variable name yourself. (e for element, i for index, or n for number are typical variable names)
The block's return value then defines the element's value. Some examples:
Array.new(10) { |i| 5 } # not using the index at all
#=> [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
Array.new(10) { |i| i } # returning the unchanged index
#=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Array.new(10) { |i| 10 - i } # subtracting the index from 10
#=> [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
Array.new(10) { |i| i * i } # multiplying the index by itself
#=> [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
As you can see, the index can be used to create a variety of sequences.
Also note that the assignment (e =) in your example is superfluous. Only the block's return value matters.
File.foreach passes the lines of a file to the given block while reading the file, one after another. Within the block, you can decide what to do with that line. In your example, you puts it. Unlike Array.new above, File.foreach doesn't use the block's return value afterwards.
In general, many Ruby methods accept blocks. This allows the method to pass values to the block for further processing. Some methods use the block's return value, others don't.
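The contrast between the two methods can be sketched as follows. A temporary file stands in for users.txt so the example runs on its own; its contents and the variable names are assumptions made for the illustration.

```ruby
require "tempfile"

# Array.new uses the block's return value as each element.
squares = Array.new(4) { |i| i * i }

# File.foreach only hands each line to the block; the block's
# return value is discarded and the call itself returns nil.
names = []
foreach_result = nil
Tempfile.create("users") do |file|   # throwaway stand-in for users.txt
  file.write("alice\nbob\n")
  file.flush
  foreach_result = File.foreach(file.path) { |line| names << line.chomp }
end

p squares         # [0, 1, 4, 9]
p names           # ["alice", "bob"]
p foreach_result  # nil
```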

Why does this code populate every array?

I have this piece of code that creates an array of arrays ([[],[],[]]) and an iterator that attempts to populate the respective arrays with numbers (shown below)
array = Array.new(3,[])
10.times do
array[0] << 2
array[1] << 3
array[2] << 4
end
When I execute this code, I expected to see this
[[2,2,2,2,etc....],[3,3,3,3,etc...],[4,4,4,4,4...etc]]
but instead I get this:
[[2,3,4,2,3,4,2,3,4....repeat],[2,3,4,2,3,4,2,3,4....repeat],[2,3,4,2,3,4,2,3,4....repeat]]
I've tried to walk through it with byebug and it makes no sense to me. What is going on here?
Array.new(3, []) is not equivalent to [[],[],[]]. It is equivalent to array = []; [array, array, array], an array of three references to the same array. After array[0] << 2, you have [[2], [2], [2]], because all three elements point at the same array.
What you want is Array.new(3) { [] }, with a block specifying the default; this way, a new [] is created for each element.
Even better, your whole code can be written as:
Array.new(3) { |i| Array.new(10, i + 2) }
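The shared-reference behaviour can be verified directly with object identity checks. This is a small sketch; the names shared, separate and compact are illustrative.

```ruby
# Second-argument form: every slot refers to the same inner array.
shared = Array.new(3, [])
shared[0] << 2

# Block form: the block runs once per slot, so each slot is distinct.
separate = Array.new(3) { [] }
separate[0] << 2

# The one-line version builds each row already filled.
compact = Array.new(3) { |i| Array.new(4, i + 2) }

p shared                       # [[2], [2], [2]]
p shared[0].equal?(shared[1])  # true
p separate                     # [[2], [], []]
p compact                      # [[2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]]
```

Using Array.new(4, i + 2) for the inner rows is safe here because integers are immutable, so sharing the same object across slots cannot cause surprises.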
The initializer you use is meant to be used with immutable objects, like creating an array of n times the same integer. For mutable objects use the block version. Here's the solution to your problem:
array = Array.new(3) { [] }
10.times do
array[0] << 2
array[1] << 3
array[2] << 4
end
array[0]
#=> [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
Read the documentation:
The second argument populates the array with references to the same
object. Therefore, it is only recommended in cases when you need to
instantiate arrays with natively immutable objects such as Symbols,
numbers, true or false.
To create an array with separate objects a block can be passed
instead. This method is safe to use with mutable objects such as
hashes, strings or other arrays:
Array.new(4) { Hash.new } #=> [{}, {}, {}, {}]
This is also a quick way to build up multi-dimensional arrays:
empty_table = Array.new(3) { Array.new(3) }
Therefore, the first line of your code should be
array = Array.new(3) { [] }

Rand method on array with use of range operator + array index

I have a small problem with this code: I want to pass an array to the random_select method and get back the element at one random index. I know that it won't work because of the way the range operator works, and that the sample method exists for this purpose. But can anyone explain why this code returns nil as one of its random values?
def random_select(array)
array[rand(array[0]..array[4])]
end
p random_select([1,2,3,4,5])
Because your range is built from the array's values, not from its indexes:
array = [1, 2, 3, 4, 5]
array[0] #=> 1
array[4] #=> 5
array[0]..array[4] #=> 1..5
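So rand picks an index from 1 to 5; array[5] is past the end of the array, which gives nil (and array[0] is never picked). What you want is achievable in this way: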
def random_select(array)
indexes = 0...array.size #=> 0...5
array[rand(indexes)]
end
array = [1, 2, 3, 4, 5]
Array.new(100) { random_select array }.uniq.sort == array #=> true
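Since rand is random, a deterministic way to see the problem is to look up every index each range can produce. This sketch uses Array#values_at for that; the variable names are illustrative.

```ruby
array = [1, 2, 3, 4, 5]

value_range = array[0]..array[4]   # 1..5, built from the values
index_range = 0...array.size       # 0...5, built from the indexes

# Every element the original code could return: index 5 is out of
# bounds (nil) and index 0 is never reached.
from_values = array.values_at(*value_range)

# Every element the corrected code can return.
from_indexes = array.values_at(*index_range)

p from_values   # [2, 3, 4, 5, nil]
p from_indexes  # [1, 2, 3, 4, 5]
```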
Have you tried using sample?
def random_select(array)
array.sample
end
Choose a random element or n random elements from the array.
The elements are chosen by using random and unique indices into the
array in order to ensure that an element doesn’t repeat itself unless
the array already contained duplicate elements.
If the array is empty the first form returns nil and the second form
returns an empty array.
The optional rng argument will be used as the random number generator.
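The forms described in the documentation excerpt can be tried out as below. The seed value 42 is arbitrary and only there to show the random: keyword; the results are random, so no particular output is shown.

```ruby
array = [1, 2, 3, 4, 5]

one = array.sample        # a single random element
two = array.sample(2)     # two elements taken from distinct indexes

# Passing a seeded generator through the random: keyword.
seeded = array.sample(random: Random.new(42))

empty_one = [].sample     # nil for an empty array
empty_two = [].sample(2)  # [] for an empty array

p one, two, seeded, empty_one, empty_two
```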

Each with index with object in Ruby

I am trying to iterate over an array and conditionally increment a counter. I am using index to compare to other array's elements:
elements.each_with_index.with_object(0) do |(element, index), diff|
diff += 1 unless other[index] == element
end
I can't get diff to change value even when changing it unconditionally.
This can be solved with inject:
elements.each_with_index.inject(0) do |diff, (element, index)|
diff += 1 unless other[index] == element
diff
end
But I am wondering if .each_with_index.with_object(0) is a valid construction and how to use it?
From the Ruby docs for each_with_object:
Note that you can’t use immutable objects like numbers, true or false
as the memo. You would think the following returns 120, but since the
memo is never changed, it does not.
(1..5).each_with_object(1) { |value, memo| memo *= value } # => 1
So each_with_object does not work with immutable objects such as integers.
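The point about immutable memos can be checked side by side. In this sketch the one-element array used as a mutable wrapper is just an illustration of the idea, not a recommended idiom.

```ruby
# memo *= value rebinds the block-local name; the Integer that was
# passed in is never changed, so the original memo comes back.
unchanged = (1..5).each_with_object(1) { |value, memo| memo *= value }

# inject feeds the block's return value into the next iteration.
product = (1..5).inject(1) { |memo, value| memo * value }

# A mutable memo works because the same object is modified in place.
boxed = (1..5).each_with_object([1]) { |value, memo| memo[0] *= value }

p unchanged  # 1
p product    # 120
p boxed      # [120]
```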
You want to count the number of element wise differences, right?
elements = [1, 2, 3, 4, 5]
other    = [1, 2, 0, 4, 5]
#                 ^
I'd use Array#zip to combine both arrays element wise and Array#count to count the unequal pairs:
elements.zip(other).count { |a, b| a != b } #=> 1
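As for the construction itself: each_with_index.with_object is valid as long as the memo is a mutable object. A minimal sketch, assuming you collect the mismatching indexes in an array (the name found is illustrative) and count them afterwards:

```ruby
elements = [1, 2, 3, 4, 5]
other    = [1, 2, 0, 4, 5]

# The array memo is mutated with <<, so the changes survive;
# with_object returns the memo when the iteration finishes.
mismatches = elements.each_with_index.with_object([]) do |(element, index), found|
  found << index unless other[index] == element
end

p mismatches       # [2]
p mismatches.size  # 1
```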

Ruby inject with index and brackets

I am trying to clean up my code. The first version uses each_with_index. In the second version I tried to compact the code with the Enumerable.inject_with_index construct that I found here.
It works now, but it seems as obscure to me as the first version.
And even worse, I don't understand the brackets around element,index in
.. .inject(groups) do |group_container, (element,index)|
but they are necessary
What is the use of these brackets?
How can I make the code clear and readable?
FIRST VERSION -- WITH "each_with_index"
class Array
  # splits as good as possible to groups of same size
  # elements are sorted. I.e. low elements go to the first group,
  # and high elements to the last group
  #
  # the default for number_of_groups is 4
  # because the intended use case is
  # splitting statistic data in 4 quartiles
  #
  # a = [1, 8, 7, 5, 4, 2, 3, 8]
  # a.sorted_in_groups(3) # => [[1, 2, 3], [4, 5, 7], [8, 8]]
  #
  # b = [[7, 8, 9], [4, 5, 7], [2, 8]]
  # b.sorted_in_groups(2) {|sub_ary| sub_ary.sum } # => [ [[2, 8], [4, 5, 7]], [[7, 8, 9]] ]
  def sorted_in_groups(number_of_groups = 4)
    groups = Array.new(number_of_groups) { Array.new }
    return groups if size == 0
    average_group_size = size.to_f / number_of_groups.to_f
    sorted = block_given? ? self.sort_by {|element| yield(element)} : self.sort
    sorted.each_with_index do |element, index|
      group_number = (index.to_f / average_group_size).floor
      groups[group_number] << element
    end
    groups
  end
end
SECOND VERSION -- WITH "inject" AND index
class Array
  def sorted_in_groups(number_of_groups = 4)
    groups = Array.new(number_of_groups) { Array.new }
    return groups if size == 0
    average_group_size = size.to_f / number_of_groups.to_f
    sorted = block_given? ? self.sort_by {|element| yield(element)} : self.sort
    sorted.each_with_index.inject(groups) do |group_container, (element,index)|
      group_number = (index.to_f / average_group_size).floor
      group_container[group_number] << element
      group_container
    end
  end
end
What is the use of these brackets?
It's a very nice feature of ruby. I call it "destructuring array assignment", but it probably has an official name too.
Here's how it works. Let's say you have an array
arr = [1, 2, 3]
Then you assign this array to a list of names, like this:
a, b, c = arr
a # => 1
b # => 2
c # => 3
You see, the array was "destructured" into its individual elements. Now, to each_with_index. As you know, it's like a regular each, but it also yields an index. inject doesn't care about any of this; it takes the input elements and passes them to its block as is. If an input element is an array (the elem/index pair from each_with_index), then we can either take it apart in the block body
sorted.each_with_index.inject(groups) do |group_container, pair|
element, index = pair
# or
# element = pair[0]
# index = pair[1]
# rest of your code
end
Or destructure that array right in the block signature. Parentheses there are necessary to give ruby a hint that this is a single parameter that needs to be split in several.
Hope this helps.
lines = %w(a b c)
indexes = lines.each_with_index.inject([]) do |acc, (el, ind)|
acc << ind - 1 if el == "b"
acc
end
indexes # => [0]
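The same parenthesized pattern works in ordinary assignment and at more than one level in block parameters. A brief sketch with made-up data:

```ruby
# Parentheses on the left-hand side split a nested array.
first, (second, third) = 1, [2, 3]

# each_with_index yields [pair, index]; the parentheses take the
# inner pair apart while index stays a plain parameter.
pairs = [[:a, 1], [:b, 2]]
labels = pairs.each_with_index.map do |(name, value), index|
  "#{index}:#{name}=#{value}"
end

p [first, second, third]  # [1, 2, 3]
p labels                  # ["0:a=1", "1:b=2"]
```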
What is the use of these brackets?
To understand the brackets, you first need to understand how destructuring works in Ruby. The simplest example I can think of is this:
[[1,3],[2,4]].each do |a,b|
  puts a, b
end
1
3
2
4
You should know how the each method works, and that its block receives one parameter. So what happens when you declare two parameters? Ruby takes the first element [1,3] and tries to split (destructure) it in two, and the result is a=1 and b=3.
Now, inject passes two arguments to its block, so the parameter list usually looks like |a,b|. By writing |group_container, (element,index)| we are in fact taking the first argument as is, and destructuring the second into two others (so, if the second argument is [1,3], element=1 and index=3). The parentheses are needed because if we used |group_container, element, index| Ruby could not know whether we are destructuring the first or the second argument, so the parentheses work as disambiguation.
(In fact, things work a bit differently under the hood, but let's set that aside for this question.)
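To see what happens when the parentheses are left out, this sketch records what each parameter receives in both forms. The arrays without_parens and with_parens are names made up for the illustration.

```ruby
letters = %w(a b)

# Without parentheses: inject passes two arguments, so element gets
# the whole [letter, index] pair and the third parameter stays nil.
without_parens = []
letters.each_with_index.inject(0) do |memo, element, index|
  without_parens << [element, index]
  memo
end

# With parentheses: the second argument is split into two names.
with_parens = []
letters.each_with_index.inject(0) do |memo, (element, index)|
  with_parens << [element, index]
  memo
end

p without_parens  # [[["a", 0], nil], [["b", 1], nil]]
p with_parens     # [["a", 0], ["b", 1]]
```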
It seems there are already some answers with good explanations. I want to add some information regarding clear and readable code.
Instead of the solution you chose, it is also possible to extend Enumerable and add this functionality.
module Enumerable
  # The block parameter is not needed but creates more readable code.
  def inject_with_index(memo = self.first, &block)
    skip = memo.equal?(self.first)
    index = 0
    self.each_entry do |entry|
      if skip
        skip = false
      else
        memo = yield(memo, index, entry)
      end
      index += 1
    end
    memo
  end
end
This way you can call inject_with_index like so:
# m = memo, i = index, e = entry
(1..3).inject_with_index(0) do |m, i, e|
puts "m: #{m}, i: #{i}, e: #{e}"
m + i + e
end
#=> 9
If you do not pass an initial value, the first element will be used, so the block is not executed for the first element.
In case someone arrives here from 2013 or later: you have each_with_object and with_index for your needs:
records.each_with_object({}).with_index do |(record, memo), index|
  memo[record.uid] = "#{index} in collection"
end
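A self-contained version of this pattern might look like the sketch below. The Record struct and its uid values are invented stand-ins for whatever records holds, and the hash is created up front so that its contents can be inspected regardless of what the chained call returns.

```ruby
# Hypothetical record type standing in for the real objects.
Record = Struct.new(:uid)
records = [Record.new("u1"), Record.new("u2")]

positions = {}
# each_with_object yields [record, memo]; with_index appends the index,
# so the first block parameter is destructured with parentheses.
records.each_with_object(positions).with_index do |(record, memo), index|
  memo[record.uid] = "#{index} in collection"
end

p positions  # {"u1" => "0 in collection", "u2" => "1 in collection"}
```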
