Hashes of Hashes Idiom in Ruby? - ruby

Creating hashes of hashes in Ruby allows for convenient two (or more) dimensional lookups. However, when inserting one must always check if the first index already exists in the hash. For example:
h = Hash.new
h['x'] = Hash.new if not h.key?('x')
h['x']['y'] = value_to_insert
It would be preferable to do the following where the new Hash is created automatically:
h = Hash.new
h['x']['y'] = value_to_insert
Similarly, when looking up a value where the first index doesn't already exist, it would be preferable if nil is returned rather than receiving an undefined method for '[]' error.
looked_up_value = h['w']['z']
One could create a Hash wrapper class that has this behavior, but is there an existing a Ruby idiom for accomplishing this task?

You can pass the Hash.new function a block that is executed to yield a default value in case the queried value doesn't exist yet:
h = Hash.new { |h, k| h[k] = Hash.new }
Of course, this can be done recursively. There's an article explaining the details.
For the sake of completeness, here's the solution from the article for arbitrary depth hashes:
hash = Hash.new(&(p = lambda{|h, k| h[k] = Hash.new(&p)}))
The person to originally come up with this solution is Kent Sibilev.

Autovivification, as it's called, is both a blessing and a curse. The trouble can be that if you "look" at a value before it's defined, you're stuck with this empty hash in the slot and you would need to prune it off later.
If you don't mind a bit of anarchy, you can always just jam in or-equals style declarations which will allow you to construct the expected structure as you query it:
((h ||= { })['w'] ||= { })['z']

Related

How would I remove multiple nested values from a hash

I've got a really confusing question that I can't seem to figure out.
Say I have this hash
hash = {
"lock_version"=>1,
"exhibition_quality"=>false,
"within"=>["FID6", "S5"],
"representation_file"=>{
"lock_version"=>0,
"created_by"=>"admin",
"within"=>["FID6", "S5"]
}
}
How can I delete from "within"=>["FID6", "S5"] a value with the pattern FID<Number> (in this example FID6)?
I've thought about it a bit and used the .delete, .reject! the within but I realised this was deleting the whole key value pair which is not what I want. Thanks a lot for any help that can put me on the right path.
You could use a method or proc/lambda to achieve the result. This solutions splits up the logic in two parts. Removing the actual FID<Number> string from the array and recursively calling the former on the correct key.
remove_fid = ->(array) { array.grep_v(/\AFID\d+\z/) }
remove_nested_fid = lambda do |hash|
hash.merge(
hash.slice('within').transform_values(&remove_fid),
hash.slice('representation_file').transform_values(&remove_nested_fid)
)
end
pp hash.then(&remove_nested_fid) # or remove_nested_fid.call(hash)
# {"lock_version"=>1,
# "exhibition_quality"=>false,
# "within"=>["S5"],
# "representation_file"=>
# {"lock_version"=>0, "created_by"=>"admin", "within"=>["S5"]}}
grep_v removes all strings from the array that do not match the given regex.
slice creates a new hash only containing the given keys. If a key is missing it will not be present in the resulting hash.
transform_values transforms the values of a hash into a new value (similar to map for Array), returning a hash.
merge creates a new hash, merging the hashes together.
This solution does not mutate the original hash structure.
You're going to need to recurse over the Hash and, in cases where the value is a Hash, process it using the same function, repeatedly. This can be problematic in Ruby -- which doesn't handle recursion very well -- if the depth of the tree is too deep, but it's the most natural way to express this type of issue.
def filter_fids(h)
h.each_pair do |k, v|
if v.is_a? Hash
filter_fids v
elsif k == "within"
v.reject! { |x| x.start_with? "FID" }
end
end
end
This will mutate the original data structure in place.

In Ruby, how to set a default value for a nested hash?

I recently looked for a way to correctly create and use nested hashes in Ruby. I promptly found a solution by Paul Morie, who answered his own question:
hash = Hash.new { |h,k| h[k] = {} }
I promptly went to use this and am glad to report it works. However, as the title says, I'd like the "secondary", "inside" hashes to return 0 by default.
I'm aware that you can define the default return value of a hash both in its constructor ("Hash.new(0)") or using .default ("hash.default(0)").
But how would you do this with hashes inside a hash?
Apparently I only had to do:
hash = Hash.new { |h,k| h[k] = Hash.new(0) }
Whoops. I'll try not to be so hasty to ask a question next time.

Ruby Hash destructive vs. non-destructive method

Could not find a previous post that answers my question...I'm learning how to use destructive vs. non-destructive methods in Ruby. I found an answer to the exercise I'm working on (destructively adding a number to hash values), but I want to be clear on why some earlier solutions of mine did not work. Here's the answer that works:
def modify_a_hash(the_hash, number_to_add_to_each_value)
the_hash.each { |k, v| the_hash[k] = v + number_to_add_to_each_value}
end
These two solutions come back as non-destructive (since they all use "each" I cannot figure out why. To make something destructive is it the equals sign above that does the trick?):
def modify_a_hash(the_hash, number_to_add_to_each_value)
the_hash.each_value { |v| v + number_to_add_to_each_value}
end
def modify_a_hash(the_hash, number_to_add_to_each_value)
the_hash.each { |k, v| v + number_to_add_to_each_value}
end
The terms "destructive" and "non-destructive" are a bit misleading here. Better is to use the conventional "in-place modification" vs. "returns a copy" terminology.
Generally methods that modify in-place have ! at the end of their name to serve as a warning, like gsub! for String. Some methods that pre-date this convention do not have them, like push for Array.
The = performs an assignment within the loop. Your other examples don't actually do anything useful since each returns the original object being iterated over regardless of any results produced.
If you wanted to return a copy you'd do this:
def modify_a_hash(the_hash, number_to_add)
Hash[
the_hash.collect do |k, v|
[ k, v + number_to_add ]
end
]
end
That would return a copy. The inner operation collect transforms key-value pairs into new key-value pairs with the adjustment applied. No = is required since there's no assignment.
The outer method Hash[] transforms those key-value pairs into a proper Hash object. This is then returned and is independent of the original.
Generally a non-destructive or "return a copy" method needs to create a new, independent version of the thing it's manipulating for the purpose of storing the results. This applies to String, Array, Hash, or any other class or container you might be working with.
Maybe this slightly different example will be helpful.
We have a hash:
2.0.0-p481 :014 > hash
=> {1=>"ann", 2=>"mary", 3=>"silvia"}
Then we iterate over it and change all the letters to the uppercase:
2.0.0-p481 :015 > hash.each { |key, value| value.upcase! }
=> {1=>"ANN", 2=>"MARY", 3=>"SILVIA"}
The original hash has changed because we used upcase! method.
Compare to method without ! sign, that doesn't modify hash values:
2.0.0-p481 :017 > hash.each { |key, value| value.downcase }
=> {1=>"ANN", 2=>"MARY", 3=>"SILVIA"}

Can someone explain to me in detail what is going on in this Ruby block?

Learning Ruby. Needed to create a hash of arrays. This works... but I don't quite understand what Ruby is doing. Could someone explain it in detail?
months = Hash.new{|h, k| h[k] = []}
This uses the Hash.new constructor to create a hash whose items default to the empty list. You can therefore do something like this:
months[2012] << 'January'
and the array will be created without you doing a months[2012] = [] first.
Short explanation: { |h, k| h[k] = [] } Is a Ruby block and as mu has mentioned, it can to some degree be compared to function in Javascript or lambda in Python. It basically is an anonymous piece of code that takes two arguments (h, k, the pipes only have the meaning to separate parameters from code) and returns a value. In this case it takes a hash and a key and sets the value of the key to a new array []. It returns a reference to the hash (but that's not important in this context).
Now to Hash.new: It is the constructor for a Hash, of which I assume you already now what it is. It optionally takes a block as an argument that is called whenever a key is accessed that does not yet exist in the Hash. In the above example, the key 2012 was not accessed before, so the block is called, which creates a new array and assigns it to the key. Afterwards, the << operator can work with that array instance.

Why does hash.keys.class return Arrays?

New to Ruby, I'm just missing something basic here. Are the keys in a Hash considered an Array unto themselves?
Yes, Hash#keys returns the hash's keys as a new array, i.e., the hash and the array returned by Hash#keys are completely independent of each other:
a = {}
b = a.keys
c = a.keys
b << :foo
a # still {}
b # [:foo]
c # still []
a[:bar] = :baz
a # {:bar => :baz}
b # still [:foo]
c # still []
From the documentation of hash.keys:
Returns a new array populated with the keys from this hash. See also Hash#values.
So the class is Array because the return value is an array.
About your question "Are the keys in a Hash considered an Array unto themselves?", they "kind" of are, hashes in Ruby are implemented as struct (st_table) which contains a list of pointers to each of its entries(st_table_entry),the st_table_entry contains the key and its value, so I guess what the keys method does is just transversing that list taking out each of the keys.
You can read this article of Ilya Grigorik where he explains much better Hashes in Ruby http://www.igvita.com/2009/02/04/ruby-19-internals-ordered-hash/
Do you think there's something paradoxical about this? Keep in mind that hashes aren't arrays in Ruby.

Resources