Hash use array as key in ruby - ruby

I have a hash that uses array as its key. When I change the array, the hash can no longer get the corresponding key and value:
1.9.3p194 :016 > a = [1, 2]
=> [1, 2]
1.9.3p194 :017 > b = { a => 1 }
=> {[1, 2]=>1}
1.9.3p194 :018 > b[a]
=> 1
1.9.3p194 :019 > a.delete_at(1)
=> 2
1.9.3p194 :020 > a
=> [1]
1.9.3p194 :021 > b
=> {[1]=>1}
1.9.3p194 :022 > b[a]
=> nil
1.9.3p194 :023 > b.keys.include? a
=> true
What am I doing wrong?
Update:
OK. Use a.clone is absolutely one way to deal with this problem.
What if I want to change "a" but still use "a" to retrieve the corresponding value (since "a" is still one of the keys) ?

The #rehash method will recalculate the hash, so after the key changes do:
b.rehash

TL;DR: consider Hash#compare_by_indentity
You need to decide if you want the hash to work by array value or array identity.
By default arrays .hash and .eql? by value, which is why changing the value confuses ruby. Consider this variant of your example:
pry(main)> a = [1, 2]
pry(main)> a1 = [1]
pry(main)> a.hash
=> 4266217476190334055
pry(main)> a1.hash
=> -2618378812721208248
pry(main)> h = {a => '12', a1 => '1'}
=> {[1, 2]=>"12", [1]=>"1"}
pry(main)> h[a]
=> "12"
pry(main)> a.delete_at(1)
pry(main)> a
=> [1]
pry(main)> a == a1
=> true
pry(main)> a.hash
=> -2618378812721208248
pry(main)> h[a]
=> "1"
See what happened there?
As you discovered, it fails to match on the a key because the .hash value under which it stored it is outdated [BTW, you can't even rely on that! A mutation might result in same hash (rare) or different hash that lands in the same bucket (not so rare).]
But instead of failing by returning nil, it matched on the a1 key.
See, h[a] doesn't care at all about the identity of a vs a1 (the traitor!). It compared the current value you supply — [1] with the value of a1 being [1] and found a match.
That's why using .rehash is just band-aid. It will recompute the .hash values for all keys and move them to the correct buckets, but it's error-prone, and may also cause trouble:
pry(main)> h.rehash
=> {[1]=>"1"}
pry(main)> h
=> {[1]=>"1"}
Oh oh. The two entries collapsed into one, since they now have the same value (and which wins is hard to predict).
Solutions
One sane approach is embracing lookup by value, which requires the value to never change. .freeze your keys. Or use .clone/.dup when building the hash, and feel free to mutate the original arrays — but accept that h[a] will lookup the current value of a against the values preserved from build time.
The other, which you seem to want, is deciding you care about identity — lookup by a should find a whatever its current value, and it shouldn't matter if many keys had or now have the same value.
How?
Object hashes by identity. (Arrays don't because types that .== by value tend to also override .hash and .eql? to be by value.) So one option is: don't use arrays as keys, use some custom class (which may hold an array inside).
But what if you want it to behave directly like a hash of arrays? You could subclass Hash, or Array but it's a lot of work to make everything work consistently. Luckily, Ruby has a builtin way: h.compare_by_identity switches a hash to work by identity (with no way to undo, AFAICT). If you do this before you insert anything, you can even have distinct keys with equal values, with no confusion:
[39] pry(main)> x = [1]
=> [1]
[40] pry(main)> y = [1]
=> [1]
[41] pry(main)> h = Hash.new.compare_by_identity
=> {}
[42] pry(main)> h[x] = 'x'
=> "x"
[44] pry(main)> h[y] = 'y'
=> "y"
[45] pry(main)> h
=> {[1]=>"x", [1]=>"y"}
[46] pry(main)> x.push(7)
=> [1, 7]
[47] pry(main)> y.push(7)
=> [1, 7]
[48] pry(main)> h
=> {[1, 7]=>"x", [1, 7]=>"y"}
[49] pry(main)> h[x]
=> "x"
[50] pry(main)> h[y]
=> "y"
Beware that such hashes are counter-intuitive if you try to put there e.g. strings, because we're really used to strings hashing by value.

Hashes use their key objects' hash codes (a.hash) to group them. Hash codes often depend on the state of the object; in this case, the hash code of a changes when an element has been removed from the array. Since the key has already been inserted into the hash, a is filed under its original hash code.
This means you can't retrieve the value for a in b, even though it looks alright when you print the hash.

You should use a.clone as key
irb --> a = [1, 2]
==> [1, 2]
irb --> b = { a.clone => 1 }
==> {[1, 2]=>1}
irb --> b[a]
==> 1
irb --> a.delete_at(1)
==> 2
irb --> a
==> [1]
irb --> b
==> {[1, 2]=>1} # STILL UNCHANGED
irb --> b[a]
==> nil # Trivial, since a has changed
irb --> b.keys.include? a
==> false # Trivial, since a has changed
Using a.clone will make sure that the key is unchanged even when we change a later on.

As you have already said, the trouble is that the hash key is the exact same object you later modify, meaning that the key changes during program execution.
To avoid this, make a copy of the array to use as a hash key:
a = [1, 2]
b = { a.clone => 1 }
Now you can continue to work with a and leave your hash keys intact.

Related

Ruby Hash populated from Array.product yields unexpected behavior

I wanted to pre-populate a Hash, given an array of keys and a default value (an empty array). I attempted to do this using the #product method of Array.
> hash = Hash[[:foo, :bar].product([[]])] # => {:foo=>[], :bar=>[]}
> hash[:foo].push(:baz) # => {:foo=>[:baz], :bar=>[:baz]}
I don't understand why the value is being applied to all keys in the hash. If instead, I use the returned value of product and populate the hash directly from that, I get expected behavior.
> [:foo, :bar].product([[]]) # => [[:foo, []], [:bar, []]]
> hash = Hash[[[:foo, []], [:bar, []]]] # => {:foo=>[], :bar=>[]}
> hash[:foo].push(:baz) # => {:foo=>[:baz], :bar=>[]}
I am using ruby 2.3.6
It's because the arrays that you pass to your hash initializer are the same object, so if you modify said object, the changes will be present everywhere it is used:
> hash = Hash[[:foo, :bar].product([[]])]
# => {:foo=>[], :bar=>[]}
> hash[:foo].object_id
# => 47106586247680
> hash[:bar].object_id
# => 47106586247680
If you copy-paste the output of your product, you're using 2 different arrays as they get instantiated separately.

What are values mapped to?

I don't understand how Ruby hashes work.
I expect these:
a = 'a'
{a => 1}[a] # => 1
{a: 1}[:a] # => 1
{2 => 1}[2] # => 1
How does this work?
{'a' => 1}['a'] # => 1
The first string 'a' is not the same object as the second string 'a'.
Ruby doesn't use object equality (equal?) for comparing hash keys. It wouldn't be very useful if it did after all.
Instead it uses eql?, which for strings is the same as ==
As a footnote to other answers, you can let a hash behave like you expected:
h = {'a'=> 1}
p h['a'] #=> 1
h.compare_by_identity
p h['a'] #=> nil ; not the same object
some_hash[k] = v
Basically, when you do this, what is stored is not a direct association k => v. Instead of that, k is asked for a hash code, which is then used to map to v.
Equal values yield equal hash codes. That's why your last example works the way it does.
A couple of examples:
1.9.3p0 :001 > s = 'string'
=> "string"
1.9.3p0 :002 > 'string'.hash
=> -895223107629439507
1.9.3p0 :003 > 'string'.hash == s.hash
=> true
1.9.3p0 :004 > 2.hash
=> 2271355725836199018
1.9.3p0 :005 > nil.hash
=> 2199521878082658865

Ruby: Properties of a hash key

I will just paste down a simple example i tried so that it would be clear to those who read this.
irb(main):001:0> h = { }
=> {}
irb(main):002:0> a=[1,2,3]
=> [1, 2, 3]
irb(main):003:0> a.object_id
=> 69922343540500
irb(main):004:0> h[a] = 12 #Hash with the array as a key
=> 12
irb(main):005:0> a << 4 #Modified the array
=> [1, 2, 3, 4]
irb(main):006:0> a.object_id #Object id obviously remains the same.
=> 69922343540500
irb(main):007:0> h[a] #Hash with the same object_id now returns nil.
=> nil
irb(main):008:0> h #Modified hash
=> {[1, 2, 3, 4]=>12}
irb(main):009:0> h[[1,2,3,4]] #Tried to access the value with the modified key -
=> nil
irb(main):011:0> h.each { |key,value| puts "#{key.inspect} maps #{value}" }
[1, 2, 3, 4] maps 12
=> {[1, 2, 3, 4]=>12}
Now when i iterate over the hash, its possible to identify the map between the key and the value.
Could some one please explain me this behaviour of ruby hash and what are the properties of hash keys.
1) As i mentioned above, the object_id hasn't changed - then why is the value set to nil.
2) Is there any possible way so that i can get back the value '12' from the hash 'h' because h[[1,2,3,4]] as mentioned above returns nil.
This happens because the key should not have its value changed while it is in use. If the value changes, we should rebuild the hash, based on its current value. Look at Ruby API for rehash method. You can get back the value, by rebuilding the hash again after the key is changed, like this:
irb(main):022:0> h.rehash
=> {[1, 2, 3, 4]=>12}
irb(main):023:0> h[a]
=> 12
Hash keys are checked using the #eql? method, and since [1, 2, 3] isn't .eql? to [1, 2, 3,4] your hash lookup has a different result.
Maybe you want to be using something other than an Array as your Hash key if the semantics aren't working for you?
The ruby hash api provides the answer: the key should not have its value changed while it is in use as a key.
I guess intern a hash is calculated for a and used for quick lookup (because a key shouldn't change, the hash is always the same). So when you do h[a] it doesn't find a match ([1,2,3].hash != [1,2,3,4].hash) and when you do h[[1,2,3]] the hashes match but the object doesn't match ([1,2,3] != [1,2,3,4]).
A fix is to use the object_id as key because it doesn't change, h[a.object_id] = 12 will return 12 when a changes. Ofcourse this has the downside that h[[1,2,3].object_id] won't return 12.
Stefaan Colman's answer is more thorough, but a couple of observations:
Ruby uses the Object#hash method to hash objects.
You can get 12 back out like by doing a.delete(4); h[a] at that point, [1,2,3] is also able to be used as a key again.

Why does clearing my hash, also clear my array of hashes?

ruby-1.9.2-p180 :154 > a = []
=> []
ruby-1.9.2-p180 :154 > h = {:test => "test"}
=> {:test=>"test"}
ruby-1.9.2-p180 :155 > a << h
=> [{:test=>"test"}]
ruby-1.9.2-p180 :156 > h.clear
=> {}
ruby-1.9.2-p180 :157 > a
=> [{}]
I'm very confused, especially since I can change the elements of the hash without it affecting the array. But when I clear the hash the array is updated and cleared of its hash contents. Can someone explain?
When you do a << h, you are really passing the reference of h to a. So when you update h, a also see's those changes because it contains a reference rather than a copy of that value.
In order for it not to change in a, you must pass a cloned value of h into a.
An example would be:
a << h.clone
Ruby does not make a copy of this hash when you add it to the array — it simply stores a reference to the original variable. So, when you empty the original variable, the reference stored in the array now refers to the empty hash.
If you want to copy the hash element so this does not occur, use Ruby's clone method.
ruby-1.9.2-p136 :049 > h = { :test => 'foo' }
=> {:test=>"foo"}
ruby-1.9.2-p136 :050 > a = []
=> []
ruby-1.9.2-p136 :051 > a << h.clone
=> [{:test=>"foo"}]
ruby-1.9.2-p136 :052 > h.clear
=> {}
ruby-1.9.2-p136 :053 > a
=> [{:test=>"foo"}]

Ruby value of a hash key?

I've got a list of values that are in a Ruby hash. Is there a way to check the value of the key and if it equals "X", then do "Y"?
I can test to see if the hash has a key using hash.has_key?, but now I need to know if hash.key == "X" then...?
Hashes are indexed using the square brackets ([]). Just as arrays. But instead of indexing with the numerical index, hashes are indexed using either the string literal you used for the key, or the symbol.
So if your hash is similar to
hash = { "key1" => "value1", "key2" => "value2" }
you can access the value with
hash["key1"]
or for
hash = { :key1 => "value1", :key2 => "value2"}
or the new format supported in Ruby 1.9
hash = { key1: "value1", key2: "value2" }
you can access the value with
hash[:key1]
This question seems to be ambiguous.
I'll try with my interpretation of the request.
def do_something(data)
puts "Found! #{data}"
end
a = { 'x' => 'test', 'y' => 'foo', 'z' => 'bar' }
a.each { |key,value| do_something(value) if key == 'x' }
This will loop over all the key,value pairs and do something only if the key is 'x'.
As an addition to e.g. #Intrepidd s answer, in certain situations you want to use fetch instead of []. For fetch not to throw an exception when the key is not found, pass it a default value.
puts "ok" if hash.fetch('key', nil) == 'X'
Reference: https://docs.ruby-lang.org/en/2.3.0/Hash.html .
How about this?
puts "ok" if hash_variable["key"] == "X"
You can access hash values with the [] operator
It seems that your question is maybe a bit ambiguous.
If “values” in the first sentence means any generic value (i.e. object, since everything in Ruby can be viewed as an object), then one of the other answers probably tells you what you need to know (i.e. use Hash#[] (e.g. hash[some_key]) to find the value associated with a key).
If, however, “values” in first sentence is taken to mean the value part of the “key, value pairs” (as are stored in hashes), then your question seems like it might be about working in the other direction (key for a given value).
You can find a key that leads to a certain value with Hash#key.
ruby-1.9.2-head :001 > hash = { :a => '1', :b => :two, :c => 3, 'bee' => :two }
=> {:a=>"1", :b=>:two, :c=>3, "bee"=>:two}
ruby-1.9.2-head :002 > a_value = :two
=> :two
ruby-1.9.2-head :003 > hash.key(a_value)
=> :b
If you are using a Ruby earlier than 1.9, you can use Hash#index.
When there are multiple keys with the desired value, the method will only return one of them. If you want all the keys with a given value, you may have to iterate a bit:
ruby-1.9.2-head :004 > hash[:b] == hash['bee']
=> true
ruby-1.9.2-head :005 > keys = hash.inject([]) do # all keys with value a_value
ruby-1.9.2-head :006 > |l,kv| kv[1] == a_value ? l << kv[0] : l
ruby-1.9.2-head :007?> end
=> [:b, "bee"]
Once you have a key (the keys) that lead to the value, you can compare them and act on them with if/unless/case expressions, custom methods that take blocks, et cetera. Just how you compare them depends on the kind of objects you are using for keys (people often use strings and symbols, but Ruby hashes can use any kind of object as keys (as long as they are not modified while they serve as keys)).
I didn't understand your problem clearly but I think this is what you're looking for(Based on my understanding)
person = {"name"=>"BillGates", "company_name"=>"Microsoft", "position"=>"Chairman"}
person.delete_if {|key, value| key == "name"} #doing something if the key == "something"
Output: {"company_name"=>"Microsoft", "position"=>"Chairman"}

Resources