I just started studying Ruby a little while ago and I was having difficulties with global versus local variable scoping.
Working on a practice problem, I found that an array defined globally was being changed by a function called on it. If I explicitly assign the array to something else, nothing changes. But if I run through and delete items one by one, this deletes them from the global array itself.
Why do delete and pop (which I also tested) methods have this behavior? I understood from reading that this should not be happening, that the "array" inside the functions is a reference to the values of arr, rather than the variable arr.
(I'm using Ruby version 2+)
def change_int x
x += 2
end
def change_arr array
array = [4, 5, 6]
end
def pop_arr array
puts array
new_array = []
while array.length > 0
new_array.push array[0]
array.delete_at 0
end
array
end
x = 5
change_int x
puts x == 5 # true
arr = [1, 2, 3]
change_arr arr
puts arr == [1, 2, 3] # true
old_arr = arr
puts pop_arr arr
puts arr == [1, 2, 3] # false
puts "arr = #{arr}" # arr = []
You can see by printing #object_id before calling pop_arr and inside pop_arr that those arrays are the same objects. This means that arguments are passed into the function by reference in Ruby.
Here is code:
def pop_arr(array)
puts array.object_id
# Rest of the fucntion
end
arr = [1, 2, 3]
puts arr.object_id
pop_arr(arr)
All of this means that when you edit array inside the function it will have effect on the object which was passed. #delete, #delete_at, #pop are operations that change the Array on which they are made.
See also: Ruby - Parameters by reference or by value? and Is Ruby pass by reference or by value?.
The curious thing is that change_arr doesn't affect the global array, but pop_arr does, in your code.
Here's what's happening: ruby passes references to objects as parameters. So like Bartosz said, you can see that at the top of those methods, the object id matches the one you passed in; they're referencing the same object.
So, in pop_arr, when you call delete_at, you're operating on the same object that you passed in, and the changes persist after the method returns.
In change_arr, the difference is that you're assigning the internal var to a new object. When you pass in the parameter array, the internal variable references the same object you passed in. When you instantiate a new Array object and assign the internal array variable to it, the internal variable is now referencing a different object.
def change_arr array
puts "change id: #{array.object_id}"
array = [4, 5, 6]
puts "change id2: #{array.object_id}"
array
end
That's why the changes don't persist after the method ends. If you wanted the changes to persist, you'd have to say
array = change_arr(array)
Hope that helps.
Related
I have run into this construction:
Enumerator.new(&:map)
and couldn't figure out how I can use it.
For sure this is valid code and it returns the enumerator, but it doesn't have any collection to enumerate over.
How to feed a collection to this enumerator?
With the current setup you can't pass any collection to it. You can't change the collection of the enumerator once instantiated.
The current code only works because is block is not instantly executed, therefore you see the error when you try to start iterating (or retrieving items).
enumerator = Enumerator.new(&:map)
enumerator.take(1)
# NoMethodError (undefined method `map' for #<Enumerator::Yielder:0x00000000055b6e90>)
This is because Enumerator::new yields a Enumerator::Yielder which doesn't has the method map.
The above could also be written as:
enumerator = Enumerator.new { |yielder| yielder.map }
If you would like to create an enumerator from a collection the easiest way is to call each without block. Other methods like map also create enumerators without block given.
enumerator = [1, 2, 3].each
#=> #<Enumerator: [1, 2, 3]:each>
If you for some reason still want to create the enumerator by hand it could look like this:
enumerator = Enumerator.new { |yielder| [1, 2, 3].each { |number| yielder << number } }
If the intent was to preselect a method of iterating before you receive the collection, you can do so in the following manner:
# assuming both the collection and block are passed by the user
map = :map.to_proc
result = map.call(collection, &block)
# which is equivalent to
result = collection.map(&block)
I am iterating over an array, and I'm wondering if there's a shorthand to refer to the receiver of #each (or #each_with_index) method from within the iteration.
self returns main.
You should be able to just reference it:
my_thing.each {|one_thing| puts my_thing }
This is pretty similar to the answer I gave here https://stackoverflow.com/a/45421168/2981429 but slightly different.
First off, you can create a scope with self bound to the array, and then execute the each in that scope:
[1].instance_exec do
# in this scope, self is the array
# thus we can use just 'each' because the self is inferred
each do |x|
# note that since 'class' is a special keyword,
# it needs to be explicitly namespaced on self
puts self.class, x
end
end
# => prints Array, 1
You can create a utility function to do this, if you want:
def bound_each(enumerable, &blk)
enumerable.instance_exec { each &blk }
end
bound_each([1]) { |x| puts self.class, x }
# prints Array, 1
You can call your each method within an Object#tap block and reference the original receiver like that.
[1, 2, 3].tap { |i| i.each { |j| p i.dup << j } }
# [1, 2, 3, 1]
# [1, 2, 3, 2]
# [1, 2, 3, 3]
#=> [1, 2, 3]
Here the receiving object is [1, 2, 3] and is passed to the block-variable i which we can use locally or in nested scopes such as each's block.
Avoid modifying the receiving object else you may end up with undesired results such as an infinite array. Using dup could allay this possibility.
This is an interesting question. As far as I know it's not possible – the closest I can come up with would be to use inject (or reduce) and explicitly pass the receiver as an argument. A bit pointless, but there might be a use-case for it that I'm not seeing:
a = [1,2,3]
a.inject(a) do |this, element|
this == a #=> true
this.include?(element) #=> true
this
end
Apart from looking a bit redundant, you have to be very sure to return this at the end of each iteration, as the return value will become this in the next iteration. For that reason (and the fact that you could just reference your collection in an each block, as in David's answer) I don't recommend using this.
Edit - as Simple Lime pointed out in the comments – I missed the obvious Enumerator#with_object, which has the same (rather pointless) effect, but without the drawback of having to return this at the end of each iteration. For example:
a = [1,2,3]
a.map.with_object(a) do |element, this|
this == a #=> true, for each iteration
end
I still don't recommend that you use this though.
Ok, reviewing Procs, lambdas, and blocks via this link.
Question on this code:
class Array
def iterate!
self.each_with_index do |n, i|
self[i] = yield(n)
end
end
end
array = [1, 2, 3, 4]
array.iterate! do |n|
n ** 2
end
puts array.inspect
Conceptually, I understand almost everything, except one line which is this:
self[i] = yield(n)
I get that this self in this line self.each_with_index do |n, i| means that it's a class method, right?
But why do we need to assign the parameters in yield(n) to self[i]?
Please explain in super basic way if you can.
(in other words, please be nice - which people generally are for most part here - just a little extra nervous that I'm not getting this which is making me feel stupid)
The method is iterate!, which is an instance method. self in self.each_with_index is the receiver of the method Enumerable#each_with_instance. Since self is the current instance of Array ([1,2,3,4] in your example), self. is not needed; i.e., you could (and imo, should) just write each_with_index do |n, i|.... In other words, self is the implied receiver when no explicit receiver is specified.
Regarding the line:
self[i] = yield(n)
for your example array = [1,2,3,4] your enumerator is:
enum = [1,2,3,4].each_with_index
#=> #<Enumerator: [1, 2, 3, 4]:each_with_index>
with elements
enum.to_a
#=> [[1, 0], [2, 1], [3, 2], [4, 3]]
The first element passed into block by Array#each is therefore [1,0], which is assigned to the block variables:
n = 1
i = 0
resulting in
self[0] = yield(1) => 1**2 => 1
and so on.
I'll try to explain in a super basic way.
I get that this self in this line self.each_with_index do |n, i| means
that it's a class method, right?
Nope. The meaning of self depends on the context. If self was in the class, it would refer to the class. But here self is in an instance method, so it refers to the instance (so each_with_index is also an instance method).
But why do we need to assign the parameters in yield(n) to self[i]?
The goal of iterate! is to modify the array in place. Since self refers to the instance, self[i] accesses the elements of the array that iterate! is being called on, thus modifying the array in place.
Also, I'm not sure what you mean by "parameters" here. yield(n) passes n to the block, runs the block, and returns the value.
self[i] = yield(n) reassigns the values in the array, to the block that was specified in
array.iterate! do |n|
n ** 2
end
which basically means, take the value of the array, and square it, save that value in the element of the array. So [1, 2, 3 , 4] becomes [1 ** 2, 2 ** 2, 3 ** 2, 4 ** 2] => [2, 4, 9, 16]
Self changes with(and actually is) the current context or surrounding object.
Since
self.each_with_index do |n, i|
...
is monkey patching the Array class and is within an instance method iterate!, self refers to the instance itself: in this case the array [1, 2, 3, 4].
You're probably thinking of this:
class some_class
def self.a_class_method
...
which is defined in the context of a class. So self is the class itself(which is also an object), not an instance of that class.
Since self is just the array [1, 2, 3, 4]
self[i] = yield(n)
is replacing each element of the array with results of the sent in block.
Here iterate! is an instance function of Array class and you have an array object.When you do
array.iterate! do |n|
n ** 2
end
You are passing a block 'do |n| n**2 end' to iterate! function.In the function you can access this block using yield.But as you can see block is expecting one parameter through |n| so you need to pass one parameter and the block code will return the square of it.
self[i] = yield(n)
self is being used in Array instance context.So it is modifying the values of array.
For more information please check this article:
http://geekdirt.com/blog/blocks-lambda-and-procs-in-ruby/
I'm working on creating a method that pads an array, and accepts 1. a desired value and 2. an optional string/integer value. Desired_size reflects the desired number of elements in the array. If a string/integer is passed in as the second value, this value is used to pad the array with extra elements. I understand there is a 'fill' method that can shortcut this - but that would be cheating for the homework I'm doing.
The issue: no matter what I do, only the original array is returned. I started here:
class Array
def pad(desired_size, value = nil)
desired_size >= self.length ? return self : (desired_size - self.length).times.do { |x| self << value }
end
end
test_array = [1, 2, 3]
test_array.pad(5)
From what I researched the issue seemed to be around trying to alter self's array, so I learned about .inject and gave that a whirl:
class Array
def pad(desired_size, value = nil)
if desired_size >= self.length
return self
else
(desired_size - self.length).times.inject { |array, x| array << value }
return array
end
end
end
test_array = [1, 2, 3]
test_array.pad(5)
The interwebs tell me the problem might be with any reference to self so I wiped that out altogether:
class Array
def pad(desired_size, value = nil)
array = []
self.each { |x| array << x }
if desired_size >= array.length
return array
else
(desired_size - array.length).times.inject { |array, x| array << value }
return array
end
end
end
test_array = [1, 2, 3]
test_array.pad(5)
I'm very new to classes and still trying to learn about them. Maybe I'm not even testing them the right way with my test_array? Otherwise, I think the issue is I get the method to recognize the desired_size value that's being passed in. I don't know where to go next. Any advice would be appreciated.
Thanks in advance for your time.
In all 3 of your tries, you are returning the original array if desired_size is greater than the original array size. You have that backwards. In other words, you just return instead of padding.
Your first attempt was close. You need to:
1) Fix your conditional check.
2) It's OK to modify the self array, so the more complicated tries are not necessary.
3) Make sure you return self no matter what you do.
By modifying self, not only do you return the modified array, but you also change the array held by the variable test_array. So if you were to do:
test_array = [1, 2, 3]
puts test_array.pad(5, 4).inspect // prints [1, 2, 3, 4, 4]
puts test_array // prints [1, 2, 3, 4, 4]
In Ruby, when a function modifies self, the function name ends with a !, so if you were to write it modifying self, it would be better to name it pad!.
If you want to write it so that it doesn't modify self, you could start with:
array = self.dup
and then do all of your operations on array.
Recently I discovered that tap can be used in order to "drily" assign values to new variables; for example, for creating and filling an array, like this:
array = [].tap { |ary| ary << 5 if something }
This code will push 5 into array if something is truthy; otherwise, array will remain empty.
But I don't understand why after executing this code:
array = [].tap { |ary| ary += [5] if something }
array remains empty. Can anyone help me?
In the first case array and ary point to the same object. You then mutate that object using the << method. The object that both array and ary point to is now changed.
In the second case array and ary again both point to the same array. You now reassign the ary variable, so that ary now points to a new array. Reassigning ary however has no effect on array. In ruby reassigning a variable never effects other variables, even if they pointed to the same object before the reassignment.
In other words array is still empty for the same reason that x won't be 42 in the following example:
x = 23
y = x
y = 42 # Changes y, but not x
Edit: To append one array to another in-place you can use the concat method, which should also be faster than using +=.
I want to expand on this a bit:
array = [].tap { |ary| ary << 5 if something }
What this does (assuming something is true-ish):
assigns array to [], an empty array.
array.object_id = 2152428060
passes [] to the block as ary. ary and array are pointing to the same array object.
array.object_id = 2152428060
ary.object_id = 2152428060
ary << 5 << is a mutative method, meaning it will modify the receiving object. It is similar to the idiom of appending ! to a method call, meaning "modify this in place!", like in .map vs .map! (though the bang does not hold any intrinsic meaning on its own in a method name). ary has 5 inserted, so ary = array = [5]
array.object_id = 2152428060
ary.object_id = 2152428060
We end with array being equal to [5]
In the second example:
array = [].tap{ |ary| ary += [5] if something }
same
same
ary += 5 += is short for ary = ary + 5, so it is first modification (+) and then assignment (=), in that order. It gives the appearance of modifying an object in place, but it actually does not. It creates an entirely new object.
array.object_id = 2152428060
ary.object_id = 2152322420
So we end with array as the original object, an empty array with object_id=2152428060 , and ary, an array with one item containing 5 with object_id = 2152322420. Nothing happens to ary after this. It is uninvolved with the original assignment of array, that has already happened. Tap executes the block after array has been assigned.