Unexpected result with splat operator - ruby

I have a hash, whose values are an array of size 1:
hash = {:start => [1]}
I want to unpack the arrays as in:
hash.each_pair{ |key, value| hash[key] = value[0] } # => {:start=>1}
and I thought the *-operator as in the following would work, but it does not give the expected result:
hash.each_pair{ |key, value| hash[key] = *value } # => {:start=>[1]}
Why does *value return [1] and not 1?

Because the []= method applied to hash takes only one argument in addition to the key (which is put inside the [] part), and a splatted/expanded array, which is in general a sequence of values (which coincidentally happens to be a single element in this particular case) cannot be directly accepted as the argument as is splatted. So it is accepted by the argument of []= as an array after all.
In other words, an argument (of the []= method) must be an object, but splatted elements (such as :foo, :bar, :baz) are not an object. The only way to interpret them as an object is to put them back into an array (such as [:foo, :bar, :baz]).
Using the splat operator, you can do it like this:
hash.each_pair{|key, value| hash.[]= key, *value}

sawa and Ninigi already pointed out why the assignment doesn't work as expected. Here's my attempt.
Ruby's assignment features work regardless of whether you're assigning to a variable, a constant or by implicitly invoking an assignment method like Hash#[]= with the assignment operator. For the sake of simplicity, I'm using a variable in the following examples.
Using the splat operator in an assignment does unpack the array, i.e.
a = *[1, 2, 3]
is evaluated as:
a = 1, 2, 3
But Ruby also allows you to implicitly create arrays during assignment by listing multiple values. Therefore, the above is in turn equivalent to:
a = [1, 2, 3]
That's why *[1] results in [1] - it's unpacked, just to be converted back to an array.
Elements can be assigned separately using multiple assignment:
a, b = [1, 2, 3]
a #=> 1
b #=> 2
or just:
a, = [1, 2, 3]
a #=> 1
You could use this in your code (note the comma after hash[key]):
hash = {:start => [1]}
hash.each_pair { |key, values| hash[key], = values }
#=> {:start=>1}
But there's another and more elegant way: you can unpack the array by putting parentheses around the array argument:
hash = {:start => [1]}
hash.each_pair { |key, (value)| hash[key] = value }
#=> {:start=>1}
The parentheses will decompose the array, assigning the first array element to value.

Because Ruby is acting unexpectedly smart here.
True, the splash operator will "fold" and "unfold" an array, but the catch in your code is what you do with that fanned value.
Take this code into account:
array = ['a', 'b']
some_var = *array
array # => ['a', 'b']
As you can see the splat operator seemingly does nothing to your array, while this:
some_var, some_other_var = *array
some_var # => "a"
somet_other_var # => "b"
Will do what you'd expect it does.
It seems ruby just "figures" if you splat an array into a single variable, that you want the array, not the values.
EDIT: As sawa pointed out in the comments, hash[key] = is not identical to variable =. []= is an instance Method of Hash, with it's own C-Code under the hood, which COULD (in theory) lead to different behaviour in some instances. I don't know of any example, but that does not mean there is none.
But for the sake of simplicity, we can asume that the regular variable assignment behaves exactly identical to hash[key] =.

Related

Ruby Hash initialised with each_with_object behaving weirdly

Initializing Ruby Hash like:
keys = [0, 1, 2]
hash = Hash[keys.each_with_object([]).to_a]
is behaving weirdly when trying to insert a value into a key.
hash[0].push('a')
# will result into following hash:
=> {0=>["a"], 1=>["a"], 2=>["a"]}
I am just trying to insert into one key, but it's updating the value of all keys.
Yes, that each_with_object is super-weird in itself. That's not how it should be used. And the problem arises precisely because you mis-use it.
keys.each_with_object([]).to_a
# => [[0, []], [1, []], [2, []]]
You see, even though it looks like these arrays are separate, it's actually the same array in all three cases. That's why if you push an element into one, it appears in all others.
Here's a better way:
h = keys.each_with_object({}) {|key, h| h[key] = []}
# => {0=>[], 1=>[], 2=>[]}
Or, say
h = keys.zip(Array.new(keys.size) { [] }).to_h
Or a number of other ways.
If you don't care about hash having this exact set of keys and simply want all keys to have empty array as default value, that's possible too.
h = Hash.new { |hash, key| hash[key] = [] }
All your keys reference the same array.
A simplified version that explains the problem:
a = []
b = a
a.push('something')
puts a #=> ['something']
puts b #=> ['something']
Even though you have two variables (a and b) there is only one Array Object. So any changes to the array referenced by variable a will change the array referenced by variable b as well. Because it is the same object.
The long version of what you are trying to achieve would be:
keys = [1, 2, 3]
hash = {}
keys.each do |key|
hash[key] = []
end
And a shorter version:
[1, 2 ,3].each_with_object({}) do |key, accu|
accu[key] = []
end

Does Ruby Hash's keep a separate list of read values vs assigned values? [duplicate]

This question already has answers here:
Strange, unexpected behavior (disappearing/changing values) when using Hash default value, e.g. Hash.new([])
(4 answers)
Closed 4 years ago.
This is related to Ruby hash default value behavior
But maybe the explanation there doesn't include this part: it seems that Ruby's Hash default value are separate whether you "read it", or see "what is set"?
One example is:
foo = Hash.new([])
foo[123].push("hi")
p foo # => {}
p foo[123] # => ["hi"]
p foo # => {}
How is it that foo[123] has a value, but foo is all empty, is somewhat beyond my comprehension... the only way I can understand it is that Ruby Hash keeps a separate list for the "read" or "getter", while somehow the "internal" assigned value are different.
If one of Ruby's design principles is "to have the least amount of surprise to the programmers", then the foo is empty but foo[123] is something, is somewhat in this case, a surprise to me.
(I haven't seen that in other languages actually... if there is a case where another language has similar behavior, maybe it is easier to make a connection.)
Suppose `
h = Hash.new(:cat)
h[:a] = 1
h[:b] = 2
h #=> {:a=>1, :b=>2}
Now
h[:a] #=> 1
h[:b] #=> 2
h[:c] #=> :cat
h[:d] #=> :cat
h #=> {:a=>1, :b=>2}
h = Hash.new(:cat) defines an empty hash h with a default value of :cat. This means that if h does not have a key k, h[k] will return :cat, nothing more, nothing less. As you can see above, executing h[k] does not change the hash when k is :c or :d.
On the other hand,
h[:c] = h[:c]
#=> :c
h #=> {:a=>1, :b=>2, :c=>:cat}
Confused? Let me write this without the syntactic sugar:
h.[]=(:d, h.[](:d))
#=> :cat
h #=> {:a=>1, :b=>2, :d=>:cat}
The default value is returned by h.[](:d) (i.e., h[:d]) whereas Hash#[]= is an assignment method (that takes two arguments, a key and a value) to which the default does not apply.
A common use of this default is to create a counting hash:
a = [1,3,1,4,2,5,4,4]
h = Hash.new(0)
a.each { |x| h[x] = h[x] + 1 }
h #=> {1=>2, 3=>1, 4=>3, 2=>1, 5=>1}
Initially, when h is empty and x #=> 1, h[1] = h[1] + 1 will evaluate to h[1] = 0 + 1, because (since h has no key 1) h[1] on the right side of the equality is set equal to the default value of zero. The next time 1 is passed to the block (x #=> 1), x[1] = x[1] + 1, which equals x[1] = 1 + 1. This time the default value is not used because h now has a key 1.
This would normally be written (incidentally):
a.each_with_object(Hash.new(0)) { |x,h| h[x] += 1 }
#=> {1=>2, 3=>1, 4=>3, 2=>1, 5=>1}
One generally does not want the default value to be a collection, such as an array or hash. Consider the following:
h = Hash.new([])
[1,2,3].map { |n| h[n] = h[n] }
h #=> {1=>[], 2=>[], 3=>[]}
Now observe:
h[1] << 2
h #=> {1=>[2], 2=>[2], 3=>[2]}
This is normally not the desired behaviour. It has happened because
h.map { |k,v| v.object_id }
#=> [25886508, 25886508, 25886508]
That is, all the values are the same object, so if the value of one key is changed the values of all other keys are changed as well.
The way around this is to use a block when defining the hash:
h = Hash.new { |h,k| h[k]=[] }
[1,2,3].each { |n| h[n] = h[n] }
h #=> {1=>[], 2=>[], 3=>[]}
h[1] << 2
h #=> {1=>[2], 2=>[], 3=>[]}
h.map { |k,v| v.object_id }
#=> [24172884, 24172872, 24172848]
When the hash h does not have a key k the block { |h,k| h[k]=[] } is executed and returns an empty array specific to that key.
The statement:
foo = Hash.new([])
creates a new Hash that has an empty array ([] as default value). The default value is the value returned by Hash::[] when its argument is not a key present in the hash.
The statement:
foo[123]
invokes Hash::[] and, because the hash is empty (the key 123 is not present in the hash), it returns a reference to the default value which is an object of type Array.
The statement above doesn't create the 123 key in the hash.
Ruby objects are always passed and returned by reference. This means that the statement above doesn't return a copy of the default value of the hash but a reference to it.
The statement:
foo[123].push("hi")
modifies the above mentioned array. Now, the default value of the foo hash is not an empty array any more; it is the array ["hi"]. But the has is still empty; none of the above statements added some (key, value) pair to it.
How is it that foo[123] has a value
foo[123] doesn't have any value, the key 123 is not present in the hash (the hash is empty). A subsequent call to foo[123] returns a reference to the default value again and the default value now it's ["hi"]. And a call to foo[456] or foo['abc'] also returns a reference to the same default value.
You didn't actually change the value of key 123, you're just accessing the default value [] you provided during initialization. You can confirm this if you inspect a different value like foo[0].
If you would do this:
foo[123] = ["hi"]
you could see the new entry, because you've created a new array under the key 123.
Edit
When you call foo[123].push("hi"), you're mutating the (default) value instead of adding a new entry.
Calling foo[123] += ["hi"] creates a new array under the given key, replacing the previous one if it existed, which will show the behavior you desire.
Printing out the hash with:
p foo
only prints the values stored in the hash. It does not display the default value (or anything added to the default array).
When you execute:
p foo[123]
Because 123 does not exist, it access the default value.
If you added two values to the default value:
foo[123].push("hi")
foo[456].push("hello")
your output would be:
p foo # => {}
p foo[123] # => ["hi","hello"]
p foo # => {}
Here, poo[123] does again still not exist, so it prints out the contents of the default value.

Ruby how to return an element of a dictionary?

# dictionary = {"cat"=>"Sam"}
This a return a key
#dictionary.key(x)
This returns a value
#dictionary[x]
How do I return the entire element
"cat"=>"Sam"
#dictionary
should do the trick for you
whatever is the last evaluated expression in ruby is the return value of a method.
If you want to return the hash as a whole. the last line of the method should look like the line I have written above
Your example is a bit (?) misleading in a sense it only has one pair (while not necessarily), and you want to get one pair. What you call a "dictionary" is actually a hashmap (called a hash among Rubyists).
A hashrocket (=>) is a part of hash definition syntax. It can't be used outside it. That is, you can't get just one pair without constructing a new hash. So, a new such pair would look as: { key => value }.
So in order to do that, you'll need a key and a value in context of your code somewhere. And you've specified ways to get both if you have one. If you only have a value, then:
{ #dictionary.key(x) => x }
...and if just a key, then:
{ x => #dictionary[x] }
...but there is no practical need for this. If you want to process each pair in a hash, use an iterator to feed each pair into some code as an argument list:
#dictionary.each do |key, value|
# do stuff with key and value
end
This way a block of code will get each pair in a hash once.
If you want to get not a hash, but pairs of elements it's constructed of, you can convert your hash to an array:
#dictionary.to_a
# => [["cat", "Sam"]]
# Note the double braces! And see below.
# Let's say we have this:
#dictionary2 = { 1 => 2, 3 => 4}
#dictionary2[1]
# => 2
#dictionary2.to_a
# => [[1, 2], [3, 4]]
# Now double braces make sense, huh?
It returns an array of pairs (which are arrays as well) of all elements (keys and values) that your hashmap contains.
If you wish to return one element of a hash h, you will need to specify the key to identify the element. As the value for key k is h[k], the key-value pair, expressed as an array, is [k, h[k]]. If you wish to make that a hash with a single element, use Hash[[[k, h[k]]]].
For example, if
h = { "cat"=>"Sam", "dog"=>"Diva" }
and you only wanted to the element with key "cat", that would be
["cat", h["cat"]] #=> ["cat", "Sam"]
or
Hash[[["cat", h["cat"]]]] #=> {"cat"=>"Sam"}
With Ruby 2.1 you could alternatively get the hash like this:
[["cat", h["cat"]]].to_h #=> {"cat"=>"Sam"}
Let's look at a little more interesting case. Suppose you have an array arr containing some or all of the keys of a hash h. Then you can get all the key-value pairs for those keys by using the methods Enumerable#zip and Hash#values_at:
arr.zip(arr.values_at(*arr))
Suppose, for example,
h = { "cat"=>"Sam", "dog"=>"Diva", "pig"=>"Petunia", "owl"=>"Einstein" }
and
arr = ["dog", "owl"]
Then:
arr.zip(h.values_at(*arr))
#=> [["dog", "Diva"], ["owl", "Einstein"]]
In steps:
a = h.values_at(*arr)
#=> h.values_at(*["dog", "owl"])
#=> h.values_at("dog", "owl")
#=> ["Diva", "Einstein"]
arr.zip(a)
#=> [["dog", "Diva"], ["owl", "Einstein"]]
To instead express as a hash:
Hash[arr.zip(h.values_at(*arr))]
#=> {"dog"=>"Diva", "owl"=>"Einstein"}
You can get the key and value in one go - resulting in an array:
#h = {"cat"=>"Sam", "dog"=>"Phil"}
key, value = p h.assoc("cat") # => ["cat", "Sam"]
Use rassoc to search by value ( .rassoc("Sam") )

Pass arguments by reference to a block with the splat operator

It seems that the arguments are copied when using the splat operator to pass arguments to a block by reference.
I have this:
def method
a = [1,2,3]
yield(*a)
p a
end
method {|x,y,z| z = 0}
#=> this puts and returns [1, 2, 3] (didn't modified the third argument)
How can I pass these arguments by reference? It seems to work if I pass the array directly, but the splat operator would be much more practical, intuitive and maintainable here.
In Ruby when you write x = value you are creating a new local variable x whether it existed previously or not (if it existed the name is simply rebound and the original value remains untouched). So you won't be able to change a variable in-place this way.
Integers are immutable. So if you send an integer there is no way you can change its value. Note that you can change mutable objects (strings, hashes, arrays, ...):
def method
a = [1, 2, "hello"]
yield(*a)
p a
end
method { |x,y,z| z[1] = 'u' }
# [1, 2, "hullo"]
Note: I've tried to answer your question, now my opinion: updating arguments in methods or blocks leads to buggy code (you have no referential transparency anymore). Return the new value and let the caller update the variable itself if so inclined.
The problem here is the = sign. It makes the local variable z be assigned to another object.
Take this example with strings:
def method
a = ['a', 'b', 'c']
yield(*a)
p a
end
method { |x,y,z| z.upcase! } # => ["a", "b", "C"]
This clearly shows that z is the same as the third object of the array.
Another point here is your example is numeric. Fixnums have fixed ids; so, you can't change the number while maintaining the same object id. To change Fixnums, you must use = to assign a new number to the variable, instead of self-changing methods like inc! (such methods can't exist on Fixnums).
Yes... Array contains links for objects. In your code when you use yield(*a) then in block you works with variables which point to objects which were in array. Now look for code sample:
daz#daz-pc:~/projects/experiments$ irb
irb(main):001:0> a = 1
=> 1
irb(main):002:0> a.object_id
=> 3
irb(main):003:0> a = 2
=> 2
irb(main):004:0> a.object_id
=> 5
So in block you don't change old object, you just create another object and set it to the variable. But the array contain link to the old object.
Look at the debugging stuff:
def m
a = [1, 2]
p a[0].object_id
yield(*a)
p a[0].object_id
end
m { |a, b| p a.object_id; a = 0; p a.object_id }
Output:
3
3
1
3
How can I pass these arguments by reference?
You can't pass arguments by reference in Ruby. Ruby is pass-by-value. Always. No exceptions, no ifs, no buts.
It seems to work if I pass the array directly
I highly doubt that. You simply cannot pass arguments by reference in Ruby. Period.

Looping on array

I'm having some trouble figuring out the right way to do this:
I have an array and a separate array of arrays that I want to compare to the first array. The first array is a special Enumerable object that happens to contain an array.
Logic tells me that I should be able to do this:
[1,2,3].delete_if do |n|
[[2,4,5], [3,6,7]].each do |m|
! m.include?(n)
end
end
Which I would expect to return
=> [2,3]
But it returns [] instead.
This idea works if I do this:
[1,2,3].delete_if do |n|
! [2,4,5].include?(n)
end
It will return
=> [2]
I can't assign the values to another object, as the [1,2,3] array must stay its special Enumerable object. I'm sure there is a much simpler explanation to this than what I'm trying. Anybody have any ideas?
You can also flatten the multi-dimensional array and use the Array#& intersection operator to get the same result:
# cast your enumerable to array with to_a
e = [1,2,3].each
e.to_a & [[2,4,5], [3,6,7]].flatten
# => [2, 3]
Can't you just add the two inner array together, and and check the inclusion on the concatenated array?
[1,2,3].delete_if do |n|
!([2,4,5] + [3,6,7]).include?(n)
end
The problem is that the return value of each is the array being iterated over, not the boolean (which is lost). Since the array is truthy, the value returned back to delete_if is always true, so all elements are deleted. You should instead use any?:
[1,2,3].delete_if do |n|
![[2,4,5], [3,6,7]].any? do |m|
m.include?(n)
end
end
#=> [2, 3]

Resources