I'm confused about the different results I'm getting when performing simple addition/concatenation on integers, strings and arrays in Ruby. I was under the impression that when assigning variable b to a (see below), and then changing the value of a, that b would remain the same. And it does so in the first two examples. But when I modify Array a in the 3rd example, both a and b are modified.
a = 100
b = a
a+= 5
puts a
puts b
a = 'abcd'
b = a
a += 'e'
puts a
puts b
a = [1,2,3,4]
b = a
a << 5
puts a.inspect
puts b.inspect
The following is what was returned in Terminal for the above code:
Ricks-MacBook-Pro:programs rickthomas$ ruby variablework.rb
105
100
abcde
abcd
[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5]
Ricks-MacBook-Pro:programs rickthomas$
I was given the following explanation by my programming instructor:
Assigning something to a new variable is just giving it an additional label, it doesn't make a copy.
It looks like += is a method, just like <<, and so you'd expect it to behave similarly. But in reality, it's "syntactic sugar", something added to the language to make things easier on developers.
When you run a += 1, Ruby converts that to a = a + 1.
In this case, we're not modifying the Fixnum in a. Instead, we're actually re-assigning on top of it, effectively blowing away the previous value of a.
On the other hand, when you run b << "c", you're modifying the underlying Array by appending the String "c" to it.
My questions are these:
1) He mentions syntactic sugar, but isn't that also what << is, i.e. syntactic sugar for the .push method?
2) Why would it matter if += is syntactic sugar or a more formal method? If there is some difference between the two, then doesn't that mean my previous understanding of syntactic sugar ("syntax within a programming language that is designed to make things easier to read or to express") is incomplete, since this isn't its only purpose?
3) If assigning b to a doesn't make a copy of a, then why doesn't wiping away a's old value mean that b's old value is also wiped away for all 3 cases (Integer, String and Array)?
As you can see, I'm pretty turned around on something that I thought I understood until now. Any help is much appreciated!
You see, names (variable names, like a and b) don't hold any values themselves. They simply point to a value. When you make an assignment
a = 5
then a now points to value 5, regardless of what it pointed to previously. This is important.
a = 'abcd'
b = a
Here both a and b point to the same string. But, when you do this
a += 'e'
It's actually translated to
a = a + 'e'
# a = 'abcd' + 'e'
So, name a is now bound to a new value, while b keeps pointing to "abcd".
a = [1,2,3,4]
b = a
a << 5
There's no assignment here; the method << modifies the existing array without replacing it. Because there's no replacement, both a and b still point to the same array, and a change made through one name is visible through the other.
The answer to 1) and 2) of your question:
The reason why += is syntactic sugar and << is not is fairly simple: += is shorthand that Ruby expands, so a += 1 is just a short version of a = a + 1. << is a method in its own right and is not an alias for push: << takes exactly one argument, whereas push takes an arbitrary number of arguments. I'm demonstrating this with send here, since [1,2]<<(1,2) is syntactically incorrect:
[1,2].send(:<<, 4, 5) #=> ArgumentError: wrong number of arguments (2 for 1)
push appends all arguments to the array:
[1,2].push(4,5,6) #=> [1,2,4,5,6]
Therefore, << is an irreplaceable part of the ruby array, since there is no equivalent method. One could argue that it is some kind of syntactic sugar for push, with disregard for the differences shown above, since it makes most operations involving appending elements to an array simpler and syntactically more recognizable.
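A short sketch of that difference, runnable in irb (the variable name list is only for illustration). Because << returns the receiver, calls can be chained, which covers most cases where push would take several arguments:

```ruby
list = [1, 2]

# << takes exactly one argument but returns the receiver, so it chains.
list << 3 << 4
p list                      # => [1, 2, 3, 4]

# push accepts any number of arguments in a single call.
list.push(5, 6)
p list                      # => [1, 2, 3, 4, 5, 6]

# Both mutate the same array object rather than creating a new one.
p list.equal?(list << 7)    # => true
```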
If we go deeper and have a look at the different uses of << throughout ruby:
Push An Element to an array:
[1,2] << 5
Concatenate one string to another; here << behaves like concat called with a single argument:
"hello " << "world"
Open up the singleton class and define a method on a class:
class Foo
class << self
def bar
puts 'baz'
end
end
end
And last but not least, shift the bits of an Integer to the left:
1 << 2 #=> 4 (binary 1 becomes binary 100, the same as 1 * 2**2)
We can see that << stands for append in most places in Ruby, since it usually appears in a context where something is appended to something already existing (Integer#<< is the exception: it is a bit shift, not an append). I would therefore rather argue that << is a significant part of the Ruby syntax and not syntactic sugar.
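To make the "<< is a real method" point concrete, here is a minimal sketch. Playlist is a hypothetical class invented for this example; it only shows that any class may define << for itself, which is not possible for +=:

```ruby
# Playlist is a hypothetical class, used only to show that << is an
# ordinary method that any class may define.
class Playlist
  attr_reader :songs

  def initialize
    @songs = []
  end

  # Returning self allows chaining, as Array#<< does.
  def <<(song)
    @songs << song
    self
  end
end

playlist = Playlist.new
playlist << "Intro" << "Outro"
p playlist.songs                    # => ["Intro", "Outro"]

# The operator form is just a method call with nicer syntax.
playlist.public_send(:<<, "Encore")
p playlist.songs.size               # => 3
```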
And the answer to 3)
The reason why b's assignment is not modified (or wiped of its old value, as you put it) if you use the += operator is just that a += 1, as a short for a = a + 1, reassigns a's value and therefore assigns a new object along with that. << is modifying the original object. You can easily see this using the object_id:
a = 1
b = a
b.object_id == a.object_id #=> true
a += 1
b.object_id == a.object_id #=> false
a = [1,2]
b = a
b.object_id == a.object_id #=> true
a << 3
b.object_id == a.object_id #=> true
There are also some caveats to Integer instances (100, 101) and so on: the same number is always the same object, since it does not make any sense to have multiple instances of, for example 100:
a = 100
b = a
b.object_id == a.object_id #=> true
a += 1
b.object_id == a.object_id #=> false
a -= 1
b.object_id == a.object_id #=> true
This also shows that the value, or the Integer instance (100) is just assigned to the variable, so the variable itself is not an object, it just points to it.
String#+ :: str + other_str → new_str Concatenation—Returns a new String containing other_str concatenated to str.
String#<< :: str << integer → str : Append—Concatenates the given object to str.
<< doesn't create a new object, whereas + does.
Sample1:
a = 100
p a.object_id
b = a
p b.object_id
a+= 5
p a.object_id
p b.object_id
puts a
puts b
Output:
201
201
211
201
105
100
Your example:
a = 100
b = a
a+= 5
is equivalent to:
a = 100
b = a
a = 100 + 5
Afterwards a holds a reference to 105 and b still holds a reference to 100. This is how assignment works in Ruby.
You expected += to change the object instance 100. In Ruby, however (quoting the docs):
There is effectively only one Fixnum object instance for any given integer value
So there's only one object instance for 100 and another (but always the same) one for 105. Changing 100 to 105 would change all 100's to 105. Therefore, it is not possible to modify these instances in Ruby, they are fixed.
A String instance on the other hand can be modified and unlike Integer there can be multiple instances for the same sequence of bytes:
a = "abcd"
b = "abcd"
a.equal? b # returns true only if a and b are the same object
# => false
a << "e" concatenates "e" to a, thus changing the receiver: a is still referencing the same object instance.
Other operations like a += "e" create (and assign) a new String: a would reference this new instance afterwards.
The documentation is pretty clear:
str + other_str → new_str
Concatenation—Returns a new String containing other_str concatenated to str.
str << obj → str
Append—Concatenates the given object to str.
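One way to see the difference between the two documented methods is to freeze the string first. This is only a sketch; the variable names are mine, and the exact exception class depends on the Ruby version, as noted in the comment:

```ruby
original = "abcd".freeze

# String#+ never touches the receiver, so it works on a frozen string.
longer = original + "e"
p longer                # => "abcde"
p original              # => "abcd"

# String#<< must modify the receiver, so a frozen string rejects it.
begin
  original << "e"
rescue => error
  # FrozenError on Ruby 2.5+, RuntimeError on older versions.
  puts "cannot append: #{error.class}"
end
```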
I can answer your questions.
1) No, the << method is not syntactic sugar for push. They are both methods with different names. You can have objects in Ruby that define one but not the other (for example String).
2) For a normal method like <<, the only thing that can happen as a result of a << x is that the object that a is pointing to gets modified. The statements a << x or a.push(x) cannot create a new object and change the variable a to point at it. That's just how Ruby works. This kind of thing is called "calling a method".
The reason that += being syntactic sugar matters is that it means += can be used to modify a variable without mutating the old object that the variable used to point to. Consider a += x. That statement can modify what object a is pointing to because it is syntactic sugar for an actual assignment to a:
a = a + x
There are two things happening above. First the + method is called on a with one argument of x. Then the return value of the + method, whatever it is, is assigned to the variable a.
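Those two steps can be observed with a class of your own. Tally below is a hypothetical class made up for this sketch; it defines only +, and Ruby still accepts += on it because += is expanded into a call to + followed by an assignment:

```ruby
# Tally is a hypothetical class; it defines + but has no += method,
# because += cannot be defined as a method in Ruby.
class Tally
  attr_reader :count

  def initialize(count)
    @count = count
  end

  # Returns a new Tally and leaves the receiver untouched.
  def +(other)
    Tally.new(count + other)
  end
end

a = Tally.new(1)
b = a
a += 2                  # evaluated as: a = a + 2

p a.count               # => 3
p b.count               # => 1
p a.equal?(b)           # => false
```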
3) The reason that your Array case is different is that you chose to mutate the array instead of creating a new array. You could have used += to avoid mutating the array. I think these six examples will clear things up for you and show you what is possible in Ruby:
Strings without mutations
a = "xy"
b = a
a += "z"
p a # => "xyz"
p b # => "xy"
Strings with mutations
a = "xy"
b = a
a << "z"
p a # => "xyz"
p b # => "xyz"
Arrays without mutations
a = [1, 2, 3]
b = a
a += [4]
p a # => [1, 2, 3, 4]
p b # => [1, 2, 3]
Arrays with mutations
a = [1, 2, 3]
b = a
a.concat [4]
p a # => [1, 2, 3, 4]
p b # => [1, 2, 3, 4]
Integers without mutations
a = 100
b = a
a += 5
puts a # => 105
puts b # => 100
Integers with mutations
Mutating an integer is actually not possible in Ruby. After a = b = 89, both variables refer to the same immutable value 89, and there is no method that changes it in place. Only a few special types of objects behave like this.
Conclusion
You should think of a variable as just a name, and an object as a nameless piece of data.
All objects in Ruby can be used in an immutable way where you never actually modify the contents of an object. If you do it that way, then you don't have to worry about the b variable in our examples changing on its own; b will always point to the same object and that object will never change. The variable b will only change when you do some form of b = x.
Most objects in Ruby can be mutated. If you have several variables referring to the same object and you choose to mutate the object (e.g. by calling push), then that change will affect all the variables that are pointing to the object. You cannot mutate Symbols and Integers.
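A quick check of the last point, as a sketch for current Ruby versions (the variable names are illustrative only):

```ruby
# Integers and Symbols are always frozen; they offer no mutating methods.
p 100.frozen?           # => true
p :name.frozen?         # => true

# The same integer value is always the same object.
a = 100
b = 50 + 50
p a.equal?(b)           # => true

# Strings are mutable by default, and equal contents do not imply identity.
s = String.new("abcd")
t = String.new("abcd")
p s.frozen?             # => false
p s == t                # => true
p s.equal?(t)           # => false
```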
I guess the above answers explain the reason. Note also that if you want b to refer to its own copy rather than to the same object as a, you can use b = a.dup instead of b = a (dup is short for duplicate).
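A minimal sketch of dup applied to the array from the question. One caveat worth knowing: dup makes a shallow copy, so the new array is independent but its elements are still shared (the names and copied variables are illustrative):

```ruby
a = [1, 2, 3, 4]
b = a.dup               # b refers to a new array with the same elements

a << 5
p a                     # => [1, 2, 3, 4, 5]
p b                     # => [1, 2, 3, 4]

# dup is shallow: the elements themselves are still shared.
names  = [String.new("ann")]
copied = names.dup
names.first << "e"
p copied                # => ["anne"]
```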
I'll try and answer your question to the best of my ability.
Yes, both are "sugars", but they work differently, and as Sergio Tulentsev said, << is not really sugar; it is closer to an alias.
An alias here works like an alias in a Unix shell: a shorter name, chosen to your liking, for something that already exists.
So for the first scenario: += basically what's happening is that you're saying:
for the value 100 assign label 'a'.
for label 'b' assign the value of label 'a'.
for label 'a' take the value of label 'a' and add 5 to label 'a's value and return a new value
print label 'a' #this now holds the value 105
print label 'b' #this now holds the value 100
Under the hood, this is because a += 5 evaluates a + 5, which returns a new object, and then points label 'a' at that new object.
For the second scenario: << it's saying:
for value [1,2,3,4] assign label 'a'
for label 'b' assign the value of label 'a'
for label 'a' do the '<<' thing on the value of label 'a'.
print label 'a'
print label 'b'
And if you're applying the << to a string it will modify the existing object and append to it.
So what's different? The difference is that << doesn't act like this:
a is the new value of a + 5
it acts like this:
5 into the value of 'a'
2) Because the syntactic sugar in this case makes the code easier for the developer to read and understand. It's a shorthand.
Shorthands, if you call them that instead, do serve different purposes.
Syntactic sugar isn't homogeneous, i.e. it doesn't work the same way for all "sugars".
3) On wiping values:
It's like this.
put value 100 into the label 'a'
put the value of label 'a' into label 'b'
remove label 'a' from the value.
So
a = 100
b = a
a = nil
puts a
puts b
=> a blank line (a is now nil), followed by 100
Variables in Ruby don't hold values; they point to values!
Related
I have built a version of mastermind that checks a user's input and provides feedback based on how close the user's guess was to the winning sequence. If you're not familiar with the game, you get feedback indicating how many of your characters were guessed correctly at the same index and how many characters guessed are in the sequence, but at the wrong index. If there are duplicates in the guess, then you would not count the extra values unless they correspond to the same number of duplicates in the secret code.
Example: If the sequence is ["G","G","G","Y"] and the user guesses ["G", "Y","G","G"] then you'd want to return 2 for items at the same index and 2 for items at different indexes that are included in the secret sequence.
Another example: If the sequence is ["X","R","Y","T"] and the user guesses ["T","T","Y","Y"] then you'd return 1 for items at the same index 1 for the character guessed that is in the sequence but at the wrong index.
Anyway, to me this is not a simple problem to solve. Here's the code I used to get it to work, but it's not elegant. There must be a better way. I was hoping someone can tell me what I'm missing here?? New to Ruby...
def index_checker(input, sequence)
  count = 0
  leftover_input = []
  leftover_sequence = []
  input.each_with_index do |char, idx|
    if char == sequence[idx]
      count += 1
    else
      leftover_input << char
      leftover_sequence << sequence[idx]
    end
  end
  diff_index_checker(leftover_input, leftover_sequence, count)
end
def diff_index_checker(input, sequence, count)
  count2 = 0
  already_counted = []
  input.each do |char|
    if sequence.include?(char) && !already_counted.include?(char)
      count2 += 1
      already_counted << char
    end
  end
  [count, count2]
end
Here's a clean Ruby solution, written in idiomatic Ruby object-oriented style:
class Mastermind
  def initialize(input_array, sequence_array)
    @input_array = input_array
    @sequence_array = sequence_array
  end
  def matches
    [index_matches, other_matches]
  end
  def results
    [index_matches.size, other_matches.size]
  end
  private
  attr_reader :input_array, :sequence_array
  def index_matches
    input_array.select.with_index { |e, i| e == sequence_array[i] }
  end
  def other_matches
    non_exact_input & non_exact_sequence
  end
  def non_exact_input
    array_difference(input_array, index_matches)
  end
  def non_exact_sequence
    array_difference(sequence_array, index_matches)
  end
  # This method is based on https://stackoverflow.com/a/3852809/5961578
  def array_difference(array_1, array_2)
    counts = array_2.inject(Hash.new(0)) { |h, v| h[v] += 1; h }
    array_1.reject { |e| counts[e] -= 1 unless counts[e].zero? }
  end
end
You would use this class as follows:
>> input_array = ["G","G","G","Y"]
>> sequence_array = ["G", "Y","G","G"]
>> guess = Mastermind.new(input_array, sequence_array)
>> guess.results
#> [2, 2]
>> guess.matches
#> [["G", "G"], ["G", "Y"]]
Here's how it works. First everything goes into a class called Mastermind. We create a constructor for the class (which in Ruby is a method called initialize) and we have it accept two arguments: input array (the user guess), and sequence array (the answer).
We set each of these arguments to an instance variable, which is indicated by its beginning with @. Then we use attr_reader to create getter methods for @input_array and @sequence_array, which allows us to get the values by calling input_array and sequence_array from any instance method within the class.
We then define two public methods: matches, which returns an array of exact matches and an array of other matches (the ones that match but at the wrong index), and results, which returns the size of each of these two arrays.
Now, within the private portion of our class, we can define the guts of the logic. Each method has a specific job, and each is named to (hopefully) help a reader understand what it is doing.
index_matches returns a subset of the input_array whose elements match the sequence_array exactly.
other_matches returns a subset of the input_array whose elements do not match the sequence_array exactly, but do match at the wrong index.
other_matches relies on non_exact_input and non_exact_sequence, each of which is computed using the array_difference method, which I copied from another SO answer. (There is no convenient Ruby method that allows us to subtract one array from another without deleting duplicates).
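To illustrate why the built-in Array#- is not enough here, the sketch below restates array_difference as a standalone method (using each_with_object in place of inject, which is my substitution) and compares the two on a guess with duplicates:

```ruby
# Same logic as array_difference in the class above, as a standalone method.
def array_difference(array_1, array_2)
  counts = array_2.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }
  array_1.reject { |e| counts[e] -= 1 unless counts[e].zero? }
end

guess = ["G", "G", "G", "Y"]

# Array#- removes every element that appears in the other array.
p guess - ["G", "G"]                    # => ["Y"]

# array_difference removes only one occurrence per element subtracted.
p array_difference(guess, ["G", "G"])   # => ["G", "Y"]
```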
Code
def matches(hidden, guess)
indices_wo_match = hidden.each_index.reject { |i| hidden[i] == guess[i] }
hidden_counts = counting_hash(hidden.values_at *indices_wo_match)
guess_counts = counting_hash(guess.values_at *indices_wo_match)
[hidden.size - indices_wo_match.size, guess_counts.reduce(0) { |tot, (k, cnt)|
tot + [hidden_counts[k], cnt].min }]
end
def counting_hash(arr)
arr.each_with_object(Hash.new(0)) { |s, h| h[s] += 1 }
end
Examples
matches ["G","G","G","Y"], ["G", "Y","G","G"]
#=> [2, 2]
matches ["X","R","Y","T"] , ["T","T","Y","Y"]
#=> [1, 1]
Explanation
The steps are as follows.
hidden = ["G","G","G","Y"]
guess = ["G", "Y","G","G"]
Save the indices i for which hidden[i] != guess[i].
indices_wo_match = hidden.each_index.reject { |i| hidden[i] == guess[i] }
#=> [1, 3]
Note that the number of indices for which the values are equal is as follows.
hidden.size - indices_wo_match.size
#=> 2
Now compute the numbers of remaining elements of guess that pair with one of the remaining values of hidden by having the same value. Begin by counting the numbers of instances of each unique element of hidden and then do the same for guess.
hidden_counts = counting_hash(hidden.values_at *indices_wo_match)
#=> {"G"=>1, "Y"=>1}
guess_counts = counting_hash(guess.values_at *indices_wo_match)
#=> {"Y"=>1, "G"=>1}
To understand how counting_hash works, see Hash::new, especially the explanation of the effect of providing a default value as an argument of new. In brief, if a hash is defined h = Hash.new(3), then if h does not have a key k, h[k] returns the default value, here 3 (the hash is not changed).
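A small sketch of that default-value behaviour, and of why h[s] += 1 in counting_hash nevertheless ends up storing keys (the variable names are illustrative):

```ruby
h = Hash.new(3)

# Reading a missing key returns the default without storing anything.
p h[:missing]           # => 3
p h.key?(:missing)      # => false
p h                     # => {}

# counts[s] += 1 expands to counts[s] = counts[s] + 1,
# so the assignment is what stores the key.
counts = Hash.new(0)
%w[G Y G].each { |s| counts[s] += 1 }
p counts["G"]           # => 2
p counts["Y"]           # => 1
p counts.size           # => 2
```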
Now compute the numbers of matches of elements of guess that were not equal to the value of hidden at the same index and which pair with an element of hidden that have the same value.
val_matches = guess_counts.reduce(0) do |tot, (k, cnt)|
tot + [hidden_counts[k], cnt].min
end
#=> 2
Lastly, return the values of interest.
[hidden.size - indices_wo_match.size, val_matches]
#=> [2, 2]
In the code presented above I have substituted out the variable val_matches.
With Ruby 2.4+ one can use Enumerable#sum to replace
guess_counts.reduce(0) { |tot, (k, cnt)| tot + [hidden_counts[k], cnt].min }
with
guess_counts.sum { |k, cnt| [hidden_counts[k], cnt].min }
def judge(secret, guess)
full = secret.zip(guess).count { |s, g| s == g }
semi = secret.uniq.sum { |s| [secret.count(s), guess.count(s)].min } - full
[full, semi]
end
Demo:
> judge(["G","G","G","Y"], ["G","Y","G","G"])
=> [2, 2]
> judge(["X","R","Y","T"], ["T","T","Y","Y"])
=> [1, 1]
A shorter alternative, though I find it less clear:
full = secret.zip(guess).count(&:uniq!)
I prefer my other answer for its simplicity, but this one would be faster if someone wanted to use this for arrays larger than Mastermind's.
def judge(secret, guess)
full = secret.zip(guess).count { |s, g| s == g }
pool = secret.group_by(&:itself)
[full, guess.count { |g| pool[g]&.pop } - full]
end
Demo:
> judge(["G","G","G","Y"], ["G","Y","G","G"])
=> [2, 2]
> judge(["X","R","Y","T"], ["T","T","Y","Y"])
=> [1, 1]
So I need to create an instance method for Array that takes two arguments, the size of an array and an optional object that will be appended to an array.
If the size argument is less than or equal to the array's length, or the size argument is equal to 0, then just return the array. If the optional argument is omitted, nil is appended instead.
Example output:
array = [1,2,3]
array.class_meth(0) => [1,2,3]
array.class_meth(2) => [1,2,3]
array.class_meth(5) => [1,2,3,nil,nil]
array.class_meth(5, "string") => [1,2,3,"string","string"]
Here is my code that I've been working on:
class Array
def class_meth(a ,b=nil)
self_copy = self
diff = a - self_copy.length
if diff <= 0
self_copy
elsif diff > 0
a.times {self_copy.push b}
end
self_copy
end
def class_meth!(a ,b=nil)
# self_copy = self
diff = a - self.length
if diff <= 0
self
elsif diff > 0
a.times {self.push b}
end
self
end
end
I've been able to create the destructive method, class_meth!, but can't seem to figure out a way to make it non-destructive.
Here's (IMHO) a cleaner solution:
class Array
def class_meth(a, b = nil)
clone.fill(b, size, a - size)
end
def class_meth!(a, b = nil)
fill(b, size, a - size)
end
end
I think it should meet all your needs. To avoid code duplication, you can make either method call the other one (but not both simultaneously, of course):
def class_meth(a, b = nil)
clone.class_meth!(a, b)
end
or:
def class_meth!(a, b = nil)
replace(class_meth(a, b))
end
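To check the fill-based approach against the examples in the question without reopening Array, here is a sketch using a plain method. padded is a hypothetical name of mine, not part of the answer's code:

```ruby
# padded is a hypothetical stand-alone equivalent of the non-destructive
# method above, written as a plain method to avoid reopening Array here.
def padded(array, size, filler = nil)
  # fill(obj, start, length) changes nothing when length is zero or negative.
  array.clone.fill(filler, array.size, size - array.size)
end

array = [1, 2, 3]
p padded(array, 0)              # => [1, 2, 3]
p padded(array, 2)              # => [1, 2, 3]
p padded(array, 5)              # => [1, 2, 3, nil, nil]
p padded(array, 5, "string")    # => [1, 2, 3, "string", "string"]
p array                         # => [1, 2, 3]
```

Note that fill puts the same filler object into every new slot, so mutating one "string" element in place would show up in the other.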
As your problem has been diagnosed, I will just offer a suggestion for how you might do it. I assume you want to pass two and optionally three, not one and optionally two, parameters to the method.
Code
class Array
  def self.class_meth(n, arr, str=nil)
    arr + (str ? [str] : [nil]) * [n-arr.size,0].max
  end
end
Examples
Array.class_meth(0, [1,2,3])
#=> [1,2,3]
Array.class_meth(2, [1,2,3])
#=> [1,2,3]
Array.class_meth(5, [1,2,3])
#=> [1,2,3,nil,nil]
Array.class_meth(5, [1,2,3], "string")
#=> [1,2,3,"string","string"]
Array.class_meth(5, ["dog","cat","pig"])
#=> ["dog", "cat", "pig", nil, nil]
Array.class_meth(5, ["dog","cat","pig"], "string")
#=> ["dog", "cat", "pig", "string", "string"]
Before withdrawing his answer, @PatriceGahide suggested using Array#fill. That would be an improvement here; i.e., replace the operative line with:
arr.fill(str ? str : nil, arr.size, [n-arr.size,0].max)
self_copy = self does not make a new object - assignment in Ruby never "copies" or creates a new object implicitly.
Thus the non-destructive case works on the same object (the instance the method was invoked upon) as in the destructive case, with a different variable bound to the same object - that is self.equal? self_copy is true.
The simplest solution is to merely use #clone, keeping in mind it is a shallow clone operation:
def class_meth(a ,b=nil)
self_copy = self.clone # NOW we have a new object ..
# .. so we can modify the duplicate object (self_copy)
# down here without affecting the original (self) object.
end
If #clone cannot be used, other solutions involve creating a new array, obtaining one with #slice (which returns a new array), or appending with #+ (which also returns a new array); however, unlike #clone, these generally lock you into returning an Array rather than any subclass the receiver may belong to.
After the above change is made, it should also be apparent that it can be written like so:
def class_meth(a ,b=nil)
clone.class_meth!(a, b) # create a NEW object; modify it; return it
# (assumes class_meth! returns the object)
end
A more appropriate implementation of #class_meth!, or #class_meth using one of the other forms to avoid modification of the current instance, is left as an exercise.
FWIW: Those are instance methods, which is appropriate, and not "class meth[ods]"; don't be confused by the ill-naming.
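As an illustration of the earlier remark that #clone keeps the receiver's class while #+ returns a plain Array, here is a small sketch. Stack is a hypothetical subclass introduced only for this example:

```ruby
# Stack is a hypothetical Array subclass used only for illustration.
class Stack < Array; end

stack = Stack.new([1, 2, 3])

# clone keeps the receiver's class.
p stack.clone.class             # => Stack

# Array#+ builds a plain Array, so the subclass is lost.
p((stack + [4]).class)          # => Array

# Either way the original object is left untouched.
p stack                         # => [1, 2, 3]
```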
I need to establish a number of Hashes, and I didn't want to list one per line, like this
a = Hash.new
b = Hash.new
I also knew that, apart from Fixnums, I could not do this
a = b = Hash.new
because both a and b would reference the same object. What I could do is this
a, b = Hash.new, Hash.new
If I had a bunch, it seemed like I could also do this
a, b = [Hash.new] * 2
this works for strings, but for Hashes, they still all reference the same object, despite the fact that
[Hash.new, Hash.new] == [Hash.new] * 2
and the former works.
See the code sample below, the only error message triggered is "multiplication hash broken". Just curious why this is.
a, b, c = [String.new] * 3
a = "hi"
puts "string broken" unless b == ""
puts "not equivalent" unless [Hash.new, Hash.new, Hash.new] == [Hash.new] * 3
a, b, c = [Hash.new, Hash.new, Hash.new]
a['hi'] = :test
puts "normal hash broken" unless b == {}
a, b, c = [Hash.new] * 3
a['hi'] = :test
puts "multiplication hash broken" unless b == {}
In answer to the original question, an easy way to initialize multiple copies would be to use the Array.new(size) {|index| block } variant of Array.new
a, b = Array.new(2) { Hash.new }
a, b, c = Array.new(3) { Hash.new }
# ... and so on
On a side note, in addition to the assignment mix-up, the other seeming issue with the original is that it appears you might be making the mistake that == is comparing object references of Hash and String. Just to be clear, it doesn't.
# Hashes are considered equivalent if they have the same keys/values (or none at all)
hash1, hash2 = {}, {}
hash1 == hash1 #=> true
hash1 == hash2 #=> true
# so of course
[Hash.new, Hash.new] == [Hash.new] * 2 #=> true
# however
different_hashes = [Hash.new, Hash.new]
same_hash_twice = [Hash.new] * 2
different_hashes == same_hash_twice #=> true
different_hashes.map(&:object_id) == same_hash_twice.map(&:object_id) #=> false
My understanding is this. [String.new] * 3 does not create three String objects. It creates one, and creates a 3-element array where each element points to that same object.
The reason you don't see "string broken" is that you have assigned a to a new value. So after the line a = "hi", a refers to a new String object ("hi") while b and c still refer to the same original object ("").
The same occurs with [Hash.new] * 3; but this time you don't re-assign any variables. Rather, you modify the one Hash object by adding the key/value pair 'hi' => :test (via a['hi'] = :test). In this step you've modified the one object referred to by a, b, and c.
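The String case behaves exactly the same way once you mutate instead of reassign. A minimal sketch, assuming the same [String.new] * 3 setup as in the question:

```ruby
a, b, c = [String.new] * 3

# Mutating through one name is visible through the others,
# because all three names refer to the single String object.
a << "hi"
p b                     # => "hi"
p a.equal?(c)           # => true

# Reassigning a name leaves the shared object, and b and c, untouched.
a = "bye"
p b                     # => "hi"
```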
Here's a contrived code example to make this more concrete:
class Thing
  attr_accessor :value
  def initialize(value)
    @value = value
  end
end
# a, b, and c all refer to the same Thing object
a, b, c = [Thing.new(0)] * 3
# Here we *modify* that object
a.value = 5
# Verify b refers to the same object as a -- outputs "5"
puts b.value
# Now *assign* a to a NEW Thing object
a = Thing.new(10)
# Verify a and b now refer to different objects -- outputs "10, 5"
puts "#{a.value}, #{b.value}"
Does that make sense?
Update: I'm no Ruby guru, so there might be a more common-sense way to do this. But if you wanted to be able to use multiplication-like syntax to initialize an array with a bunch of different objects, you might consider this approach: create an array of lambdas, then call all of them using map.
Here's what I mean:
def call_all(lambdas)
lambdas.map{ |f| f.call }
end
a, b, c = call_all([lambda{Hash.new}] * 3)
You can verify that this approach works pretty easily:
x, y, z = call_all([lambda{rand(100)}] * 3)
# This should output 3 random (probably different) numbers
puts "#{x}, #{y}, #{z}"
Update 2: I like numbers1311407's approach using Array.new a lot better.
Recently I discovered that tap can be used in order to "drily" assign values to new variables; for example, for creating and filling an array, like this:
array = [].tap { |ary| ary << 5 if something }
This code will push 5 into array if something is truthy; otherwise, array will remain empty.
But I don't understand why after executing this code:
array = [].tap { |ary| ary += [5] if something }
array remains empty. Can anyone help me?
In the first case array and ary point to the same object. You then mutate that object using the << method. The object that both array and ary point to is now changed.
In the second case array and ary again both point to the same array. You now reassign the ary variable, so that ary now points to a new array. Reassigning ary however has no effect on array. In Ruby, reassigning a variable never affects other variables, even if they pointed to the same object before the reassignment.
In other words array is still empty for the same reason that x won't be 42 in the following example:
x = 23
y = x
y = 42 # Changes y, but not x
Edit: To append one array to another in-place you can use the concat method, which should also be faster than using +=.
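For example, the tap call from the question could be sketched with concat like this (something is set to true here only so the snippet runs on its own):

```ruby
something = true  # stand-in for the condition in the question

# concat appends in place, so the array that tap returns is the one changed.
array = [].tap { |ary| ary.concat([5]) if something }
p array                 # => [5]
```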
I want to expand on this a bit:
array = [].tap { |ary| ary << 5 if something }
What this does (assuming something is true-ish):
creates [], an empty array; this is the object array will refer to.
array.object_id = 2152428060
passes [] to the block as ary. ary and array are pointing to the same array object.
array.object_id = 2152428060
ary.object_id = 2152428060
ary << 5: << is a mutating method, meaning it will modify the receiving object. It is similar to the idiom of appending ! to a method name, meaning "modify this in place!", as in .map vs .map! (though the bang does not hold any intrinsic meaning on its own in a method name). ary has 5 inserted, so ary = array = [5]
array.object_id = 2152428060
ary.object_id = 2152428060
We end with array being equal to [5]
In the second example:
array = [].tap{ |ary| ary += [5] if something }
same
same
ary += [5]: += is short for ary = ary + [5], so it is first concatenation (+) and then assignment (=), in that order. It gives the appearance of modifying an object in place, but it actually does not. It creates an entirely new object.
array.object_id = 2152428060
ary.object_id = 2152322420
So we end with array as the original object, an empty array with object_id = 2152428060, and ary, an array with one item containing 5 with object_id = 2152322420. Nothing happens to ary after this. It is uninvolved with the assignment of array: tap returns its receiver (the original array) no matter what the block does, and that receiver is what array ends up referring to.
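It may help to see roughly what tap does. The method below is a sketch of the behaviour, not Ruby's actual implementation, and my_tap is a hypothetical name:

```ruby
# my_tap is a hypothetical helper that mimics how Object#tap behaves.
def my_tap(object)
  yield object   # the block's return value is discarded
  object         # the original object is always what comes back
end

mutated    = my_tap([]) { |ary| ary << 5 }     # changes the object itself
reassigned = my_tap([]) { |ary| ary += [5] }   # rebinds only the block-local ary

p mutated               # => [5]
p reassigned            # => []
```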
Inspired by How can I marshal a hash with arrays? I wonder what's the reason that Array#<< won't work properly in the following code:
h = Hash.new{Array.new}
#=> {}
h[0]
#=> []
h[0] << 'a'
#=> ["a"]
h[0]
#=> [] # why?!
h[0] += ['a']
#=> ["a"]
h[0]
#=> ["a"] # as expected
Does it have to do with the fact that << changes the array in-place, while Array#+ creates a new instance?
If you create a Hash using the block form of Hash.new, the block gets executed every time you try to access an element which doesn't actually exist. So, let's just look at what happens:
h = Hash.new { [] }
h[0] << 'a'
The first thing that gets evaluated here, is the expression
h[0]
What happens when it gets evaluated? Well, the block gets run:
[]
That's not very exciting: the block simply creates an empty array and returns it. It doesn't do anything else. In particular, it doesn't change h in any way: h is still empty.
Next, the message << with one argument 'a' gets sent to the result of h[0] which is the result of the block, which is simply an empty array:
[] << 'a'
What does this do? It adds the element 'a' to an empty array, but since the array doesn't actually get assigned to any variable or stored in the hash, it becomes unreachable and is eventually garbage collected.
Now, if you evaluate h[0] again:
h[0] # => []
h is still empty, since nothing ever got assigned to it, therefore the key 0 is still non-existent, which means the block gets run again, which means it again returns an empty array (but note that it is a completely new, different empty array now).
h[0] += ['a']
What happens here? First, the operator assignment gets desugared to
h[0] = h[0] + ['a']
Now, the h[0] on the right side gets evaluated. And what does it return? We already went over this: h[0] doesn't exist, therefore the block gets run, the block returns an empty array. Again, this is a completely new, third empty array now. This empty array gets sent the message + with the argument ['a'], which causes it to return yet another new array which is the array ['a']. This array then gets assigned to h[0].
Lastly, at this point:
h[0] # => ['a']
Now you have finally actually put something into h[0] so, obviously, you get out what you put in.
So, to answer the question you probably had, why don't you get out what you put in? You didn't put anything in in the first place!
If you actually want to assign to the hash inside the block, you have to, well assign to the hash inside the block:
h = Hash.new {|this_hash, nonexistent_key| this_hash[nonexistent_key] = [] }
h[0] << 'a'
h[0] # => ['a']
It's actually fairly easy to see what is going on in your code example, if you look at the identities of the objects involved. Then you can see that every time you call h[0], you get a different array.
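A minimal sketch of that identity check, comparing a block that only returns an array with a block that also stores it (h and g are illustrative names):

```ruby
# Block that only returns a new array: nothing is stored in the hash.
h = Hash.new { [] }
p h[0].equal?(h[0])     # => false
p h.key?(0)             # => false

# Block that stores the array: later lookups return the stored object.
g = Hash.new { |hash, key| hash[key] = [] }
p g[0].equal?(g[0])     # => true
g[0] << 'a'
p g[0]                  # => ["a"]
```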
The problem in your code is that h[0] << 'a' creates a new Array and hands it out when you index with h[0], but doesn't store the modified Array anywhere after the << 'a', because there is no assignment.
Meanwhile h[0] += ['a'] works because it's equivalent to h[0] = h[0] + ['a']. It's the assignment ([]=) that makes the difference.
The first case may seem confusing, but it is useful when you just want to receive some unchanging default element from a Hash when the key is not found. Otherwise you could end up populating the Hash with a great number of unused values just by indexing it.
h = Hash.new{ |a,b| a[b] = Array.new }
h[0] << "hello world"
#=> ["hello world"]
h[0]
#=> ["hello world"]