Use [].replace to make a copy of an array - ruby

I have a class where I was using the Array#shift instance method on an instance variable. I thought I made a "copy" of my instance variable but in fact I hadn't and shift was actually changing the instance variables.
For example, before I would have expected to get ["foo", "bar", "baz"] both times given the following:
class Foo
  attr_reader :arr
  def initialize(arr)
    @arr = arr
  end
  def some_method
    foo = arr # foo and arr refer to the same Array object
    foo.shift
  end
end
foo = Foo.new %w(foo bar baz)
p foo.arr #=> ["foo", "bar", "baz"]
foo.some_method
p foo.arr #=> ["bar", "baz"]
result:
["foo", "bar", "baz"]
["bar", "baz"]
But as shown my "copy" wasn't really a copy at all. Now, I'm not sure if I should be calling what I want a "copy", "clone", "dup", "deep clone", "deep dup", "frozen clone", etc...
I was really confused about what to search for and found a bunch of crazy attempts to do what seems like "making a copy of an array".
Then I found another answer with literally one line that solved my problem:
class Foo
  attr_reader :arr
  def initialize(arr)
    @arr = arr
  end
  def some_method
    foo = [].replace arr # foo is a new Array with the same elements
    foo.shift
  end
end
foo = Foo.new %w(foo bar baz)
p foo.arr #=> ["foo", "bar", "baz"]
foo.some_method
p foo.arr #=> ["foo", "bar", "baz"]
output:
["foo", "bar", "baz"]
["foo", "bar", "baz"]
I understand that Array#replace is an instance method being called on an instance of Array that happens to be an empty array (so for example foo = ["cats", "and", "dogs"].replace arr will still work) and it makes sense that I get a "copy" of the instance variable @arr.
But how is that different than:
foo = arr
foo = arr.clone
foo = arr.dup
foo = arr.deep_clone
Marshal.load # something something
# etc...
Or any of the other crazy combinations of dup and map and inject that I'm seeing on SO?

This comes down to the tricky concept of mutability in Ruby. Among core objects, it usually comes up with arrays and hashes. Strings are mutable as well, although string literals can be frozen by putting the magic comment # frozen_string_literal: true at the top of the file. See What does the comment "frozen_string_literal: true" do?.
In this case, you can call dup, clone, or deep_dup (which comes from ActiveSupport, not Ruby core) to the same effect as replace:
['some', 'array'].dup
['some', 'array'].deep_dup
['some', 'array'].clone
Marshal.load Marshal::dump(['some', 'array'])
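To make the difference from plain assignment concrete, here is a minimal sketch. The variable names (alias_ref, dup_copy, replaced) are illustrative and not from the question. Assignment only adds a second name for the same Array, while dup and [].replace both produce a new Array holding the same elements.

```ruby
arr = %w[foo bar baz]

alias_ref = arr              # no copy: both names refer to the same Array
dup_copy  = arr.dup          # new Array, same element objects
replaced  = [].replace(arr)  # new Array whose contents were replaced with arr's

is_alias        = alias_ref.equal?(arr) # true, same object
dup_is_new      = !dup_copy.equal?(arr) # true, different object
replaced_is_new = !replaced.equal?(arr) # true, different object

# Mutating the copies leaves arr alone.
dup_copy.shift
replaced.shift
after_copies = arr.dup # snapshot: still ["foo", "bar", "baz"]

# Mutating through the alias changes arr itself.
alias_ref.shift
after_alias = arr.dup  # snapshot: ["bar", "baz"]
```

So for a flat array of values you do not mutate, [].replace(arr) and arr.dup are interchangeable; dup is simply the more common way to spell it.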
In terms of differences, dup and clone are the same except for some nuanced details - see What's the difference between Ruby's dup and clone methods?
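One of those nuanced details can be shown in a few lines: clone carries over the frozen state of the original, while dup does not. This is only a sketch of that single difference, not a full comparison.

```ruby
frozen = %w[foo bar].freeze

dup_frozen   = frozen.dup.frozen?   # false: dup returns an unfrozen copy
clone_frozen = frozen.clone.frozen? # true: clone keeps the frozen state

# The unfrozen dup can be mutated without touching the original.
copy = frozen.dup
copy.shift
```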
The difference between these and deep_dup is that deep_dup works recursively. For example, if you dup or clone a nested array, the inner array will not be copied, so both outer arrays share it:
a = [[1]]
b = a.clone
b[0][0] = 2
a # => [[2]]
The same thing happens with hashes.
Marshal.load Marshal::dump <object> is a general approach to deep cloning objects, which, unlike deep_dup, is in Ruby core. Marshal::dump returns a string, so it can also be handy for serializing objects to a file. Note that not every object can be marshalled (procs and IO objects, for example, cannot).
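As a rough sketch of the contrast, the same nested array can be copied both ways. The names shallow and deep are illustrative.

```ruby
a = [[1]]

shallow = a.dup                          # new outer Array, shared inner Array
deep    = Marshal.load(Marshal.dump(a))  # new outer Array and new inner Array

deep[0][0] = 2
after_deep = Marshal.load(Marshal.dump(a)) # snapshot of a: still [[1]]

shallow[0][0] = 3
after_shallow = a # a is now [[3]] because the inner Array is shared
```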
If you want to avoid unexpected errors like this, keep a mental index of which methods have side effects and only call those when it makes sense to. An exclamation point at the end of a method name conventionally indicates that it has side effects, but many mutating methods have no exclamation point, including shift, unshift, push, concat, delete, and pop. A big part of functional programming is avoiding side effects. You can see https://www.sitepoint.com/functional-programming-techniques-with-ruby-part-i/
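For the case in the question, one way to sidestep the copy entirely is to use non-mutating methods in place of shift. The sketch below assumes Ruby 2.5 or later (for FrozenError) and freezes the array only to demonstrate that nothing mutates it.

```ruby
arr = %w[foo bar baz].freeze

first = arr.first   # "foo", like the return value of shift, without removing it
rest  = arr.drop(1) # ["bar", "baz"], a new Array; arr is untouched

# A frozen Array turns accidental mutation into an immediate error.
error = nil
begin
  arr.shift
rescue FrozenError => e
  error = e
end
```

Freezing is optional, but it can be a cheap way to find out where an unexpected mutation is coming from.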

The preferred method is dup:
Use array.dup whenever you need to copy an array.
Use array.map(&:dup) whenever you need to copy a 2D array.
Don't use the marshalling trick unless you really want to deep copy an entire object graph. Usually you want to copy the arrays only but not the contained elements.
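A small sketch of the 2D case, using a made-up grid variable: map(&:dup) copies the outer array and each row, so writes to the copy's rows do not show up in the original.

```ruby
grid = [[1, 2], [3, 4]]

copy = grid.map(&:dup) # new outer Array, and a new Array for every row
copy[0][0] = 99        # changes only the copied row
copy << [5, 6]         # changes only the copied outer Array
```

Anything nested deeper than the rows would still be shared, which is where the marshalling approach comes in.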

Related

Simplest way in Ruby to iterate over element plus next element, where "the next element" is nil on the last iteration?

I want to loop over an array like [:foo, :bar, :baz].
On each iteration, I want the item and the next item. On the last iteration, the "next item" should be nil.
So I want to yield in turn :foo, :bar, then :bar, :baz, and finally :baz, nil.
I can think of a few ways of achieving this:
my_list.zip(my_list.from(1)) { |item, next_item| … }
my_list.each.with_index(1) { |item, i| next_item = my_list[i] }
[*my_list, nil].each_cons(2) { |item, next_item| … }
But I feel like I might be missing some simpler way. Am I?
If you only ever have three non-nil elements in your Array and always want two elements at a time from #each_cons, then just append a nil and call it a day.
%i[foo bar baz].push(nil).each_cons(2).map { [_1, _2] }
#=> [[:foo, :bar], [:bar, :baz], [:baz, nil]]
Alternatively, you can just reference each indexed element of the named Array by the current or successor index number, because any reference to elements outside the Array's bounds will be nil. For example:
array = %i[foo bar baz]
array.each_index.map { [array[_1], array[_1.succ]] }
#=> [[:foo, :bar], [:bar, :baz], [:baz, nil]]
In either case, you'll need #map or some other mechanism if you want to return an Array of Array objects. If you want to do something else, like yield, you can do that too.
Note that if you're going to have a variable number of Array elements, or make changes to the number of #each_cons items, then you'll need a different approach. If that is the case, feel free to open a new or related question.
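If the goal is to yield rather than collect, the "append a nil" idea can be wrapped in a small helper. The method name each_with_next is hypothetical and not part of Ruby; this is a sketch that builds a temporary array with the extra nil and leaves the input untouched.

```ruby
# Hypothetical helper: yields each item together with its successor,
# with nil as the successor of the last item.
def each_with_next(list)
  [*list, nil].each_cons(2) { |item, next_item| yield item, next_item }
end

pairs = []
each_with_next(%i[foo bar baz]) { |item, next_item| pairs << [item, next_item] }

single = []
each_with_next([:only]) { |item, next_item| single << [item, next_item] }
```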

Ruby: Is it true that #map generally doesn't make sense with bang methods?

This question was inspired by this one:
Ruby: Why does this way of using map throw an error?
Someone pointed out the following:
map doesn't make much sense when used with ! methods.
You should either:
use map with gsub
or use each with gsub!
Can someone explain why that is?
Base object
Here's an array with strings as elements:
words = ['hello', 'world']
New array
If you want a new array with modified strings, you can use map with gsub:
new_words = words.map{|word| word.gsub('o','#') }
p new_words
#=> ["hell#", "w#rld"]
p words
#=> ["hello", "world"]
p new_words == words
#=> false
The original strings and the original array aren't modified.
Strings modified in place
If you want to modify the strings in place, you can use each with gsub!:
words.each{|word| word.gsub!('o','#') }
p words
#=> ["hell#", "w#rld"]
map and gsub!
new_words = words.map{|word| word.gsub!('o','#') }
p words
#=> ["hell#", "w#rld"]
p new_words
#=> ["hell#", "w#rld"]
p words == new_words
#=> true
p new_words.object_id
#=> 12704900
p words.object_id
#=> 12704920
Here, a new array is created, but the elements are the exact same ones!
It doesn't bring anything more than the previous examples. It creates a new Array for nothing. It also might confuse people reading your code by sending opposite signals:
gsub! will indicate that you want to modify existing objects
map will indicate that you don't want to modify existing objects.
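The claim that the elements are "the exact same ones" can be checked with equal?, which compares object identity rather than content. A minimal sketch, starting again from the unmodified base array:

```ruby
words = %w[hello world]
new_words = words.map { |word| word.gsub!('o', '#') }

same_array    = new_words.equal?(words) # false: map built a new Array
same_elements = words.zip(new_words).all? { |a, b| a.equal?(b) } # true: same String objects
```

This only holds because both words contain an "o"; gsub! returns nil when it makes no substitution.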
Map is for building a new array without mutating the original. Each is for performing some action on each element of an array. Doing both at once is surprising.
>> arr = ["foo bar", "baz", "quux"]
=> ["foo bar", "baz", "quux"]
>> arr.map{|x| x.gsub!(' ', '-')}
=> ["foo-bar", nil, nil]
>> arr
=> ["foo-bar", "baz", "quux"]
Since !-methods generally have side effects (and only incidentally might return a value), each should be preferred to map when invoking a !-method.
An exception might be when you have a list of actions to perform. The method to perform the action might sensibly be named with a !, but you wish to collect the results in order to report which ones succeeded or failed.
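A sketch of that exception, with entirely hypothetical names: the struct and its run! method are made up for illustration and stand in for whatever action objects you have. Here map is used because the collected results are the point.

```ruby
# Hypothetical action type; run! is an illustrative name, not a library API.
action_class = Struct.new(:name, :should_fail) do
  def run!
    raise "#{name} failed" if should_fail
    :ok
  end
end

actions = [action_class.new('backup', false), action_class.new('deploy', true)]

# Collect one [name, outcome] pair per action so failures can be reported.
results = actions.map do |action|
  begin
    [action.name, action.run!]
  rescue StandardError => e
    [action.name, e.message]
  end
end
```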

How to get index of value in anonymous array inside of iteration

I would like to be able to take an anonymous array, iterate through it and inside of the iterator block find out what the index is of the current element.
For instance, I am trying to output only every third element.
["foo", "bar", "baz", "bang", "bamph", "foobar", "Hello, Sailor!"].each do |elem|
  if index_of(elem) % 3 == 0
    puts elem
  end
end
(where index_of is a nonexistent method being used as a placeholder here to demonstrate what I'm trying to do)
In theory the output should be:
foo
bang
Hello, Sailor!
This is pretty straightforward when I'm naming the array. But when it is anonymous, I can't very well refer to the array by name. I've tried using self.find_index(elem) as well as self.index(elem) but both fail with the error: NoMethodError: undefined method '(find_)index' for main:Object
What is the proper way to do this?
Use each_with_index:
arr = ["foo", "bar", "baz", "bang", "bamph", "foobar", "Hello, Sailor!"]
arr.each_with_index do |elem, index|
  puts elem if index % 3 == 0
end
Another way:
arr = ["foo", "bar", "baz", "bang", "bamph", "foobar", "Hello, Sailor!"]
arr.each_slice(3) { |a| puts a.first }
#=> foo
# bang
# Hello, Sailor!
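Both approaches also work directly on an array literal, which is what the question asked for; no name is needed. The sketch below collects into arrays (selected and firsts are illustrative names) instead of printing.

```ruby
selected = []
["foo", "bar", "baz", "bang", "bamph", "foobar", "Hello, Sailor!"].each_with_index do |elem, index|
  selected << elem if (index % 3).zero?
end

# The each_slice variant, also called on the literal itself.
firsts = ["foo", "bar", "baz", "bang", "bamph", "foobar", "Hello, Sailor!"]
         .each_slice(3).map(&:first)
```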

Strange ruby for loop behavior (why does this work)

def reverse(ary)
  result = []
  for result[0,0] in ary
  end
  result
end
assert_equal ["baz", "bar", "foo"], reverse(["foo", "bar", "baz"])
This works and I want to understand why. Any explanations?
If I were to rewrite this using each instead of for/in, it would look like this:
def reverse(ary)
  result = []
  # for result[0,0] in ary
  ary.each do |item|
    result[0, 0] = item
  end
  result
end
for a in b basically says, take each item in the array b and assign it to expression a. So some magic happens when it's not a simple variable.
The array[index, length] = something syntax allows replacement of multiple items, even 0 items. So ary[0,0] = item says to insert item at index zero, replacing zero items. It's basically an unshift operation.
But really, just use the each method with a block instead. A for loop with no body that changes state has to be one of the most obtuse and hard to read things, and it doesn't do what you expect at first glance. each provides far fewer crazy surprises.
You are putting each value from ary at the first location of result. So let's say we had the array:
a = ["baz", "bar", "foo"]
So a[0,0] = 5 will make a equal to [5, "baz", "bar", "foo"]
Since you iterate over the entire array, you are inserting each element into the beginning of the result array while shifting the existing elements, thus reversing the original one.
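To see the equivalence with unshift, and one place where it stops holding, here is a short sketch with illustrative variable names. For non-array items the two behave the same; when the assigned value is itself an Array, the slice assignment splices its elements in, whereas unshift inserts it as a single element.

```ruby
via_slice   = []
via_unshift = []
%w[foo bar baz].each do |item|
  via_slice[0, 0] = item     # insert at index 0, replacing zero elements
  via_unshift.unshift(item)  # same effect for a non-array item
end

spliced = [:x]
spliced[0, 0] = [1, 2]  # elements are spliced in: [1, 2, :x]

nested = [:x]
nested.unshift([1, 2])  # the Array is inserted whole: [[1, 2], :x]
```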

Is there analogue for this ruby method?

Recently I came up with this method:
module Enumerable
  def transform
    yield self
  end
end
The purpose of this method is similar to that of tap, but with the ability to modify the object.
For example with this method I can change order in an array in chain style:
array.do_something.transform{ |a| [a[3],a[0],a[1],a[2]] }.do_something_else
Instead of doing this:
a0,a1,a2,a3 = array.do_something
result = [a3, a0, a1, a2].do_something_else
There are also other conveniences to using this method, but...
The method is very straightforward, so I guess there should already be a built-in method with the same purpose.
Is there an analogue for this Ruby method?
You can do that with instance_eval:
Evaluates (…) the given block, within the context of the receiver
Example:
%w(a b c d).instance_eval{|a| [a[3], a[0], a[1], a[2]] }
# => ["d", "a", "b", "c"]
or using self:
%w(a b c d).instance_eval{ [self[3], self[0], self[1], self[2]] }
# => ["d", "a", "b", "c"]
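On newer Rubies there is also a built-in that matches the transform method from the question more closely than instance_eval does, since it does not change self inside the block: yield_self (Ruby 2.5 and later) and its alias then (Ruby 2.6 and later). A minimal sketch, assuming one of those versions:

```ruby
# then yields the receiver to the block and returns the block's result.
reordered = %w[a b c d].then { |a| [a[3], a[0], a[1], a[2]] }

# yield_self is the older name for the same method.
also_reordered = %w[a b c d].yield_self { |a| [a[3], a[0], a[1], a[2]] }
```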
I can't test this now but you should be able to do something like this:
array= [1,2,3]
array.tap{ |a| a.clear }
Tap runs the block then returns self so if you can modify self in the block, it will pass back the updated array. In my example clear modifies self in the block so the modified self is returned.
If you want this functionality, I would suggest adding a method like do_something_else! that modifies self then running it within your tap block.
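The caveat with tap is that the block's return value is discarded, so it only helps when the block mutates the receiver. A short sketch of both cases:

```ruby
# The block's result is thrown away; tap returns the untouched receiver.
unchanged = [1, 2, 3].tap { |a| a.map { |x| x * 2 } }

# map! mutates the receiver in place, so the change is visible after tap.
mutated = [1, 2, 3].tap { |a| a.map! { |x| x * 2 } }
```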
So where is the question? It all depends on how far you want to go into the functional programming realm. Just read Learn You a Haskell for Great Good! and you will never be the same. Once you open this Pandora's box it's really hard to stop, and after some experimenting I wonder if I'm still writing Ruby code. Compare (using your transform method defined for Object)
h = {}
{:a => :a_method, :b => :b_method}.each do |k, m|
  h[k] = some_object.__send__(m)
end
h.some_other_method
and
some_object.transform(&THash[:a => :a_method, :b => :b_method]).some_other_method
where
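THash =
  lambda do |t|
    # Convert each method name to a proc once, up front.
    kms = t.map { |k, v| [k, v.to_proc] }
    lambda do |x|
      # Build and return a fresh hash per call (each alone would return kms).
      kms.each_with_object({}) { |(k, m), h| h[k] = m[x] }
    end
  end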
So if you want to think of your objects in terms of transformations, it makes perfect sense and does make code more readable, but it takes more than just a transform method: you also need to define the generic transformations you use frequently.
Basically it's called point-free programming, though some call it pointless. Depends on your mindset.
