Ruby Hash populated from Array.product yields unexpected behavior

I wanted to pre-populate a Hash, given an array of keys and a default value (an empty array). I attempted to do this using the #product method of Array.
> hash = Hash[[:foo, :bar].product([[]])] # => {:foo=>[], :bar=>[]}
> hash[:foo].push(:baz) # => {:foo=>[:baz], :bar=>[:baz]}
I don't understand why the value is being applied to all keys in the hash. If instead, I use the returned value of product and populate the hash directly from that, I get expected behavior.
> [:foo, :bar].product([[]]) # => [[:foo, []], [:bar, []]]
> hash = Hash[[[:foo, []], [:bar, []]]] # => {:foo=>[], :bar=>[]}
> hash[:foo].push(:baz) # => {:foo=>[:baz], :bar=>[]}
I am using ruby 2.3.6

It's because the arrays that you pass to your hash initializer are the same object: [[]] contains a single inner array, and product pairs every key with that one object. If you modify it, the change shows up everywhere it is used:
> hash = Hash[[:foo, :bar].product([[]])]
# => {:foo=>[], :bar=>[]}
> hash[:foo].object_id
# => 47106586247680
> hash[:bar].object_id
# => 47106586247680
If you copy-paste the output of your product, you're using 2 different arrays as they get instantiated separately.
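If the goal is a hash pre-populated with independent empty arrays, one option is to build each value inside a block, so that a new array is created per key. A minimal sketch, where prepopulate is a hypothetical helper name and not part of Ruby:

```ruby
# Build a hash in which every key gets its own, freshly created empty array.
# `prepopulate` is a hypothetical helper name used only for this example.
def prepopulate(keys)
  # The block runs once per key, so `[]` creates a new array each time.
  Hash[keys.map { |key| [key, []] }]
end

hash = prepopulate([:foo, :bar])
hash[:foo].push(:baz)
p hash[:foo]                      # => [:baz]
p hash[:bar]                      # => []
p hash[:foo].equal?(hash[:bar])   # => false
```

Hash[...] is used here because it also works on older Rubies such as 2.3.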

Related

strange Hash behavior for nested assignments with defaults [duplicate]

console output:
2.1.3 :011 > b = Hash.new( Hash.new([]) )
=> {}
2.1.3 :012 > b[:a][:b] << 'hello'
=> ["hello"]
2.1.3 :013 > b
=> {}
2.1.3 :014 > b.size
=> 0
2.1.3 :015 > b.keys
=> []
2.1.3 :016 > b[:a][:b]
=> ["hello"]
Why is it that I can access the value stored at b[:a][:b] yet b has a size of 0 and no keys?
new(obj) → new_hash
If obj is specified, this single object will be used for all default values.
Hash.new([]) holds a single default Array object. When you run b[:a][:b] << 'hello', you are appending "hello" to that default Array. The default value is what gets returned when a key doesn't exist in the Hash.
Don't assume that the line b[:a][:b] << 'hello' adds keys to either Hash object.
b[:a] returns the default Hash object, which is Hash.new([]). On that Hash you then call Hash#[] with the key :b, and since :b doesn't exist there either, it returns the default Array object.
That's why b, b.size and b.keys all show that the Hash is empty.
Finally.
Why is it that I can access the value stored at b[:a][:b] yet b has a size of 0 and no keys?
Because, as mentioned above, you appended "hello" to the default Array. That same array is what comes back when you evaluate b[:a][:b].
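If what you actually want is a nested hash whose missing keys are created on first access, the block form of Hash.new can store the defaults instead of merely returning them. A rough sketch, with nested_hash_of_arrays as a made-up helper name:

```ruby
# `nested_hash_of_arrays` is a hypothetical helper name for this sketch.
def nested_hash_of_arrays
  Hash.new do |outer, outer_key|
    # Store a new inner hash under the missing outer key ...
    outer[outer_key] = Hash.new do |inner, inner_key|
      # ... whose own missing keys get a new array stored as well.
      inner[inner_key] = []
    end
  end
end

b = nested_hash_of_arrays
b[:a][:b] << 'hello'
p b.size      # => 1
p b.keys      # => [:a]
p b[:a][:b]   # => ["hello"]
```

Note that with this approach even a plain read such as b[:x][:y] creates the keys :x and :y.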

Cloning a Hash in Ruby2 [duplicate]

I'm trying to clone a hash to make a new copy of the original, but it seems that when I set a value in the new hash, the original hash changes as well.
rr = Hash.new
command = "/usr/local/bin/aws route53 list-resource-record-sets --hosted-zone-id EXAMPLEID --max-items 1"
rr=JSON.parse(%x{#{command}})
puts rr
if rr["ResourceRecordSets"][0]["TTL"] != 60
  new_rr = rr.clone
  new_rr["ResourceRecordSets"][0]["TTL"] = 60
  puts rr
  puts new_rr
end
Output:
{"NextRecordType"=>"MX", "NextRecordName"=>"example.com.", "ResourceRecordSets"=>[{"ResourceRecords"=>[{"Value"=>"1.2.3.4"}], "Type"=>"A", "Name"=>"example.com.", "TTL"=>1800}], "MaxItems"=>"1", "IsTruncated"=>true}
{"NextRecordType"=>"MX", "NextRecordName"=>"example.com.", "ResourceRecordSets"=>[{"ResourceRecords"=>[{"Value"=>"1.2.3.4"}], "Type"=>"A", "Name"=>"example.com.", "TTL"=>60}], "MaxItems"=>"1", "IsTruncated"=>true}
{"NextRecordType"=>"MX", "NextRecordName"=>"example.com.", "ResourceRecordSets"=>[{"ResourceRecords"=>[{"Value"=>"1.2.3.4"}], "Type"=>"A", "Name"=>"example.com.", "TTL"=>60}], "MaxItems"=>"1", "IsTruncated"=>true}
I don't see Hash.clone documented in Ruby 2.0. Should I be using another method to create a Hash copy now?
Thanks in advance.
A Hash is a collection of keys and values, where the values are references to objects. Duplicating a hash creates a new hash, but only the object references are copied, so the new hash contains the very same value objects. That is why it behaves like this:
hash = {1 => 'Some string'} # Strings are mutable
hash2 = hash.clone
hash2[1] #=> 'Some string'
hash2[1].upcase! # modifying the shared object
hash[1] #=> 'SOME STRING' # so it appears modified in both hashes
hash2[1] = 'Other string' # pointing the second hash at a different object
hash[1] #=> 'SOME STRING' # the original object has not been changed
hash2[2] = 'new value' # adding an entry to the second hash only
hash[2] #=> nil
If you want to duplicate the referenced objects as well, you need to perform a deep duplication. Rails (the activesupport gem) adds this as the deep_dup method. If you are not using Rails and don't want to install the gem, you can write it like this:
class Hash
  def deep_dup
    Hash[map { |key, value|
      copy = if value.respond_to?(:deep_dup)
               value.deep_dup
             else
               begin
                 value.dup
               rescue
                 value # some objects cannot be duplicated; reuse them as-is
               end
             end
      [key, copy]
    }]
  end
end
hash = {1 => 'Some string'} #Strings are mutable
hash2 = hash.deep_dup
hash2[1] #=> 'Some string'
hash2[1].upcase! # modifying referenced object
hash2[1] #=> 'SOME STRING'
hash[1] #=> 'Some string' # the original hash still points to the original, unmodified object
You should probably write something similar for arrays. I would also think about writing it for the whole Enumerable module, but that might be slightly trickier.
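To cover arrays as well without reopening core classes, the same idea can be written as one recursive method. This is only a sketch: deep_copy_tree is a hypothetical name, keys are reused rather than copied, and a hash's default value or default block is not carried over.

```ruby
# `deep_copy_tree` is a hypothetical name for this sketch.
def deep_copy_tree(obj)
  case obj
  when Hash
    # Keys are reused as-is; only the values are copied recursively.
    obj.each_with_object({}) { |(key, value), copy| copy[key] = deep_copy_tree(value) }
  when Array
    obj.map { |element| deep_copy_tree(element) }
  else
    begin
      obj.dup
    rescue TypeError
      obj # older Rubies cannot dup integers, symbols or nil; reuse them
    end
  end
end

original = { 'list' => ['a', { 'ttl' => 1800 }] }
copy = deep_copy_tree(original)
copy['list'][1]['ttl'] = 60
p original['list'][1]['ttl']   # => 1800
p copy['list'][1]['ttl']       # => 60
```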
The easiest way to make a deep copy of most Ruby objects (including strings, arrays, hashes and combinations thereof) is to use Marshal:
def deep_copy(obj)
Marshal.load(Marshal.dump(obj))
end
For example,
h = {a: 1, b: [:c, d: {e: 4}]} # => {:a=>1, :b=>[:c, {:d=>{:e=>4}}]}
hclone = h.clone
hdup = h.dup
hmarshal = deep_copy(h)
h[:b][1][:d][:e] = 5
h # => {:a=>1, :b=>[:c, {:d=>{:e=>5}}]}
hclone # => {:a=>1, :b=>[:c, {:d=>{:e=>5}}]}
hdup # => {:a=>1, :b=>[:c, {:d=>{:e=>5}}]}
hmarshal # => {:a=>1, :b=>[:c, {:d=>{:e=>4}}]}
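One caveat worth knowing with the Marshal approach: not every object can be dumped. A hash created with a default block is one such case, since Marshal.dump raises a TypeError for it. A short illustration, reusing the deep_copy method from above:

```ruby
def deep_copy(obj)
  Marshal.load(Marshal.dump(obj))
end

plain = { a: [1, 2] }
p deep_copy(plain)[:a].equal?(plain[:a])   # => false

# A hash with a default block carries a Proc, which Marshal cannot serialize.
with_block = Hash.new { |hash, key| hash[key] = [] }
begin
  deep_copy(with_block)
rescue TypeError => e
  puts "cannot marshal: #{e.message}"
end
```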

IRB (apparently) not inspecting hashes correctly

I'm seeing some odd behavior in IRB 1.8.7 with printing hashes. If I initialize my hash with a Hash.new, it appears that my hash is "evaluating" to an empty hash:
irb(main):024:0> h = Hash.new([])
=> {}
irb(main):025:0> h["test"]
=> []
irb(main):026:0> h["test"] << "blah"
=> ["blah"]
irb(main):027:0> h
=> {}
irb(main):028:0> puts h.inspect
{}
=> nil
irb(main):031:0> require 'pp'
=> true
irb(main):032:0> pp h
{}
=> nil
irb(main):033:0> h["test"]
=> ["blah"]
As you can see, the data is actually present in the hash, but trying to print or display it seems to fail. Initialization with a hash literal seems to fix this problem:
irb(main):050:0> hash = { 'test' => ['testval'] }
=> {"test"=>["testval"]}
irb(main):051:0> hash
=> {"test"=>["testval"]}
irb(main):053:0> hash['othertest'] = ['secondval']
=> ["secondval"]
irb(main):054:0> hash
=> {"othertest"=>["secondval"], "test"=>["testval"]}
The issue here is that invoking h["test"] doesn't actually insert a new key into the hash - it just returns the default value, which is the array that you passed to Hash.new.
1.8.7 :010 > a = []
=> []
1.8.7 :011 > a.object_id
=> 70338238506580
1.8.7 :012 > h = Hash.new(a)
=> {}
1.8.7 :013 > h["test"].object_id
=> 70338238506580
1.8.7 :014 > h["test"] << "blah"
=> ["blah"]
1.8.7 :015 > h.keys
=> []
1.8.7 :016 > h["bogus"]
=> ["blah"]
1.8.7 :017 > h["bogus"].object_id
=> 70338238506580
1.8.7 :019 > a
=> ["blah"]
The hash itself is still empty - you haven't assigned anything to it. The data isn't present in the hash - it's present in the array that is returned for missing keys in the hash.
It looks like you're trying to create a hash of arrays. To do so, I recommend you initialize like so:
h = Hash.new { |h,k| h[k] = [] }
Your version isn't working correctly for me, either. The reason why is a little complicated to understand. From the docs:
If obj is specified, this single object will be used for all default values.
I've added the bolding. The rest of the emphasis is as-is.
You're specifying that obj is [], and it's only a default value. It doesn't actually set the contents of the hash to that default value. So when you do h["blah"] << "test", you're really just asking it to return the default object itself (not a copy) and then appending "test" to that one shared array. It never goes into the hash at all. (I need to give Chris Heald credit for explaining this below.)
If instead you give it a block, it calls that block EVERY TIME you do a lookup on a non-existent entry of the hash. So you're not just creating one Array anymore. You're creating one for each entry of the hash.
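One detail that is easy to miss: the block only creates the value. Whether it ends up in the hash depends on the block assigning it. A small comparison, where both helper names are made up for the example:

```ruby
# Both helper names are hypothetical and exist only for this comparison.
def lookup_only_default
  Hash.new { [] }                            # new array per lookup, never stored
end

def storing_default
  Hash.new { |hash, key| hash[key] = [] }    # new array stored under the key
end

h1 = lookup_only_default
h1["test"] << "blah"
p h1["test"]   # => []
p h1.keys      # => []

h2 = storing_default
h2["test"] << "blah"
p h2["test"]   # => ["blah"]
p h2.keys      # => ["test"]
```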

Ruby Hash.new weirdness [duplicate]

This is a simple one, as I'm lost for words.
Why is this happening:
1.9.3-p194 :001 > h = Hash.new([])
=> {}
1.9.3-p194 :002 > h[:key1] << "Ruby"
=> ["Ruby"]
1.9.3-p194 :003 > h
=> {}
1.9.3-p194 :004 > h.keys
=> []
1.9.3-p194 :005 > h[:key1]
=> ["Ruby"]
When you create a hash like this:
h = Hash.new([])
it means that whenever the hash is accessed with a key that has not been defined yet, it's going to return:
[]
Now when you do:
h[:key1] << "Ruby"
h[:key1] returned [], to which "Ruby" got pushed, resulting in ["Ruby"] as output, since that is the last object returned. Because that array is the default object itself, ["Ruby"] is now also what gets returned whenever h is accessed with an undefined key.
Hence, when you do:
h[:key1] or h[:key2] or h[:whatever]
You will get
["Ruby"]
as output.
Hope this helps.
Look at the documentation of Hash.new
new → new_hash
new(obj) → new_hash
new {|hash, key| block } → new_hash
If this hash is subsequently accessed by a key that doesn’t correspond to a hash entry, the value returned depends on the style of new used to create the hash.
In the first form, the access returns nil.
If obj is specified, this single object will be used for all default values.
If a block is specified, it will be called with the hash object and the key, and should return the default value. It is the block’s responsibility to store the value in the hash if required.
irb(main):015:0> h[:abc] # ["Ruby"]
So ["Ruby"] is returned as default value instead of nil if key is not found.
The construction Hash.new([]) returns a default value, but that value is never stored in the hash. You're working with the hash as if the default value were part of it.
What you need is a construction that initializes the hash at the missing key:
hash = Hash.new { |h,k| h[k] = [] }
hash[:key1] << "Ruby"
hash #=> {:key1=>["Ruby"]}
You did not actually set a value with h[:key1] << "Ruby". You just appended to the default array that is returned for a key that was not found. So no key is created.
If you write this, it will be okay:
h = Hash.new([])
h[:keys1] = []
h[:keys1] << "Ruby"
I have to admit this tripped me up too when I read your question. I had a look at the docs and it became clear, though.
If obj is specified, this single object will be used for all default values.
So what you're actually doing is modifying the one single array object that is used for all default values, without ever assigning to the key!
Check it out:
h = Hash.new([])
h[:x] << 'x'
# => ['x']
h
# => {}
h[:y]
# => ['x'] # CRAZY TIMES
So you need to do assignment somehow - h[:x] += ['x'] might be the way to go.
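To see why += behaves differently from <<, it helps to spell out what it expands to. A quick sketch, where append_by_reassignment is a hypothetical helper name:

```ruby
# `append_by_reassignment` is a hypothetical helper name for this sketch.
def append_by_reassignment(hash, key, item)
  # Same as hash[key] = hash[key] + [item]: Array#+ builds a new array,
  # and the assignment stores it under the key. The default is never mutated.
  hash[key] += [item]
end

h = Hash.new([])
append_by_reassignment(h, :x, 'x')
append_by_reassignment(h, :x, 'y')
p h[:x]    # => ["x", "y"]
p h.keys   # => [:x]
p h[:y]    # => []
```

The trade-off is that every append allocates a new array, so for large lists the block form of Hash.new may be the better fit.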

Hash use array as key in ruby

I have a hash that uses array as its key. When I change the array, the hash can no longer get the corresponding key and value:
1.9.3p194 :016 > a = [1, 2]
=> [1, 2]
1.9.3p194 :017 > b = { a => 1 }
=> {[1, 2]=>1}
1.9.3p194 :018 > b[a]
=> 1
1.9.3p194 :019 > a.delete_at(1)
=> 2
1.9.3p194 :020 > a
=> [1]
1.9.3p194 :021 > b
=> {[1]=>1}
1.9.3p194 :022 > b[a]
=> nil
1.9.3p194 :023 > b.keys.include? a
=> true
What am I doing wrong?
Update:
OK. Using a.clone is definitely one way to deal with this problem.
What if I want to change "a" but still use "a" to retrieve the corresponding value (since "a" is still one of the keys) ?
The #rehash method will recalculate the hash, so after the key changes do:
b.rehash
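Applied to the example from the question, that looks roughly like this. The helper name lookup_after_mutation is made up for the demonstration:

```ruby
# `lookup_after_mutation` is a hypothetical helper for this demonstration.
def lookup_after_mutation(rehash)
  a = [1, 2]
  b = { a => 1 }
  a.delete_at(1)        # a is now [1], so a.hash changes as well
  b.rehash if rehash    # re-index the entries by their keys' current hash codes
  b[a]
end

p lookup_after_mutation(true)    # => 1
p lookup_after_mutation(false)   # usually nil: the entry is still filed under the old hash code
```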
TL;DR: consider Hash#compare_by_identity
You need to decide if you want the hash to work by array value or array identity.
By default, arrays implement .hash and .eql? by value, which is why changing the value confuses Ruby. Consider this variant of your example:
pry(main)> a = [1, 2]
pry(main)> a1 = [1]
pry(main)> a.hash
=> 4266217476190334055
pry(main)> a1.hash
=> -2618378812721208248
pry(main)> h = {a => '12', a1 => '1'}
=> {[1, 2]=>"12", [1]=>"1"}
pry(main)> h[a]
=> "12"
pry(main)> a.delete_at(1)
pry(main)> a
=> [1]
pry(main)> a == a1
=> true
pry(main)> a.hash
=> -2618378812721208248
pry(main)> h[a]
=> "1"
See what happened there?
As you discovered, it fails to match on the a key because the .hash value under which it stored it is outdated [BTW, you can't even rely on that! A mutation might result in same hash (rare) or different hash that lands in the same bucket (not so rare).]
But instead of failing by returning nil, it matched on the a1 key.
See, h[a] doesn't care at all about the identity of a vs a1 (the traitor!). It compared the current value you supply — [1] with the value of a1 being [1] and found a match.
That's why using .rehash is just a band-aid. It will recompute the .hash values for all keys and move them to the correct buckets, but it's error-prone, and may also cause trouble:
pry(main)> h.rehash
=> {[1]=>"1"}
pry(main)> h
=> {[1]=>"1"}
Oh oh. The two entries collapsed into one, since they now have the same value (and which wins is hard to predict).
Solutions
One sane approach is embracing lookup by value, which requires the value to never change. .freeze your keys. Or use .clone/.dup when building the hash, and feel free to mutate the original arrays, but accept that h[a] will look up the current value of a against the values preserved from build time.
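Both ideas can be combined by storing a frozen copy as the key. A minimal sketch, with add_by_value as a hypothetical helper name:

```ruby
# `add_by_value` is a hypothetical helper name for this sketch.
def add_by_value(hash, key, value)
  # Store a frozen copy: the caller's array stays mutable, the key does not.
  hash[key.dup.freeze] = value
end

a = [1, 2]
h = {}
add_by_value(h, a, 'twelve')
a.delete_at(1)             # only the caller's array changes
p h[[1, 2]]                # => "twelve"
p h[a]                     # => nil
p h.keys.first.frozen?     # => true
```

Because the stored key is frozen, an accidental mutation of it raises an error instead of silently corrupting the lookup.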
The other, which you seem to want, is deciding you care about identity — lookup by a should find a whatever its current value, and it shouldn't matter if many keys had or now have the same value.
How?
Object hashes by identity. (Arrays don't because types that .== by value tend to also override .hash and .eql? to be by value.) So one option is: don't use arrays as keys, use some custom class (which may hold an array inside).
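A custom class along those lines could be as small as the following. ItemList is a made-up name; the point is only that it does not override hash or eql?, so the identity-based versions from Object apply:

```ruby
# `ItemList` is a hypothetical wrapper class for this sketch.
class ItemList
  attr_reader :items

  def initialize(items)
    @items = items
  end
  # No hash/eql? overrides: the versions inherited from Object work by identity.
end

x = ItemList.new([1])
y = ItemList.new([1])
h = { x => 'x', y => 'y' }
x.items.push(7)   # mutating the wrapped array does not affect x.hash
p h[x]            # => "x"
p h[y]            # => "y"
p h.size          # => 2
```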
But what if you want it to behave directly like a hash of arrays? You could subclass Hash or Array, but it's a lot of work to make everything work consistently. Luckily, Ruby has a built-in way: h.compare_by_identity switches a hash to work by identity (with no way to undo, AFAICT). If you do this before you insert anything, you can even have distinct keys with equal values, with no confusion:
[39] pry(main)> x = [1]
=> [1]
[40] pry(main)> y = [1]
=> [1]
[41] pry(main)> h = Hash.new.compare_by_identity
=> {}
[42] pry(main)> h[x] = 'x'
=> "x"
[44] pry(main)> h[y] = 'y'
=> "y"
[45] pry(main)> h
=> {[1]=>"x", [1]=>"y"}
[46] pry(main)> x.push(7)
=> [1, 7]
[47] pry(main)> y.push(7)
=> [1, 7]
[48] pry(main)> h
=> {[1, 7]=>"x", [1, 7]=>"y"}
[49] pry(main)> h[x]
=> "x"
[50] pry(main)> h[y]
=> "y"
Beware that such hashes are counter-intuitive if you use e.g. strings as keys, because we're really used to strings hashing by value.
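For instance, two equal strings that are different objects are different keys in such a hash. A quick illustration, using String.new so that the two strings are guaranteed to be separate objects:

```ruby
h = {}.compare_by_identity
key = String.new('name')
h[key] = 1
p h[key]                    # => 1
p h[String.new('name')]     # => nil
p h.key?(key)               # => true
```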
Hashes use their key objects' hash codes (a.hash) to group them. Hash codes often depend on the state of the object; in this case, the hash code of a changes when an element has been removed from the array. Since the key has already been inserted into the hash, a is filed under its original hash code.
This means you can't retrieve the value for a in b, even though it looks alright when you print the hash.
You should use a.clone as the key:
irb --> a = [1, 2]
==> [1, 2]
irb --> b = { a.clone => 1 }
==> {[1, 2]=>1}
irb --> b[a]
==> 1
irb --> a.delete_at(1)
==> 2
irb --> a
==> [1]
irb --> b
==> {[1, 2]=>1} # STILL UNCHANGED
irb --> b[a]
==> nil # Trivial, since a has changed
irb --> b.keys.include? a
==> false # Trivial, since a has changed
Using a.clone will make sure that the key is unchanged even when we change a later on.
As you have already said, the trouble is that the hash key is the exact same object you later modify, meaning that the key changes during program execution.
To avoid this, make a copy of the array to use as a hash key:
a = [1, 2]
b = { a.clone => 1 }
Now you can continue to work with a and leave your hash keys intact.

Resources