Ruby Hash: creating a default value for non-existing elements - ruby

I learned from this answer here that this is possible:
h = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }
h['bar'] # => {}
h['tar']['star']['par'] # => {}
Can someone explain how it works?

Hashes have a thing called a default_proc, which is simply a proc that Ruby runs when you try to access a hash key that doesn't exist. This proc receives both the hash itself and the target key as parameters.
You can set a Hash's default_proc at any time. Passing a block parameter to Hash.new simply allows you to initialize a Hash and set its default_proc in one step:
h = Hash.new
h.default_proc = proc{ |hash, key| hash[key] = 'foo' }
# The above is equivalent to:
h = Hash.new{ |hash, key| hash[key] = 'foo' }
We can also access the default proc for a hash by calling h.default_proc. Knowing this, and knowing that the ampersand (&) allows a proc passed as a normal parameter to be treated as a block parameter, we can now explain how this code works:
cool_hash = Hash.new{ |h, k| h[k] = Hash.new(&h.default_proc) }
The block passed to Hash.new will be called when we try to access a key that doesn't exist. This block will receive the hash itself as h, and the key we tried to access as k. We respond by setting h[k] (that is, the value of the key we're trying to access) to a new hash. Into the constructor of this new hash, we pass the "parent" hash's default_proc, using an ampersand to force it to be interpreted as a block parameter. This is the equivalent of doing the following, to an infinite depth:
cool_hash = Hash.new{ |h, k| h[k] = Hash.new{ |h, k| h[k] = Hash.new{ ... } } }
The end result is that the key we tried to access was initialized to a new Hash, which itself will initialize any "not found" keys to a new Hash, which itself will have the same behavior, etc. It's hashes all the way down.
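The behavior described above can be checked with a short sketch. The helper name nested_hash and the sample keys are made up for illustration; only Hash.new and default_proc come from the answer.

```ruby
# Hypothetical helper: builds the self-propagating hash from the explanation above.
def nested_hash
  Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }
end

config = nested_hash
# The intermediate hashes for :db and :primary are created on the way down.
config[:db][:primary][:host] = 'localhost'

config == { db: { primary: { host: 'localhost' } } }  # => true
config[:db][:primary].default_proc.nil?                # => false, children got the proc too

# Assumption worth knowing: a plain lookup is enough to create an entry.
config[:typo]
config.key?(:typo)                                     # => true
```

Note the last two lines: because the block assigns inside the hash, merely reading a missing key leaves an empty hash behind.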

This code creates hashes in a chain, so that every link in the chain has the same default_proc.
So the default_proc of h, h['bar'], and so on is the same: it stores and returns a new Hash instance that uses this same default_proc.

Related

Why do default Arrays in Hash.new require key value specification? [duplicate]

If...
variable = Hash.new(0)
...will default to new values being the integer zero without having to specify the associated key, why do I have to use a block and specify the associated key for the new values to default to an array, like so...
variable = Hash.new { |h, k| h[k] = [] }
I read ruby-doc.org but can't seem to find an answer. Perhaps it's "under the hood" and I can't see/comprehend it.
For context, the question came up when I couldn't reconcile why the first method didn't work and the second method did:
def find_duplicates1(array)
  indices = Hash.new([])
  array.each_with_index { |ele, i| indices[ele] << i }
  indices.select { |ele, indices| indices.length > 1 }
end
def find_duplicates2(array)
  indices = Hash.new { |h, k| h[k] = [] }
  array.each_with_index { |ele, i| indices[ele] << i }
  indices.select { |ele, indices| indices.length > 1 }
end
Because indices = Hash.new([]) means that looking up an unknown key returns that [] array. But the empty default array is not assigned to the previously unknown key.
Here is an example:
indices = Hash.new([])
indices[:foo] << :bar
indices
#=> {}
Even worse, because we appended a value to the default array, that array is no longer empty, and the changed default is returned for all other unknown keys too:
indices[:baz]
#=> [:bar]
Whereas indices = Hash.new { |h, k| h[k] = [] } means that the block will run for all unknown keys and within the block, a new empty array is initialized and that new array is actually assigned to the former unknown key.
indices = Hash.new { |h, k| h[k] = [] }
indices[:foo] << :bar
indices
#=> {:foo=>[:bar]}
indices[:bar]
#=> []
Btw you might be interested in the Enumerable#tally method. By using it your method can be simplified to:
def find_duplicates(array)
array.tally.select { |k, v| v > 1 }.keys
end
It's because the default (whatever object it is) is used as the default. That object will be presented for EVERY undefined instance. They're all pointing to the same object.
For immutable objects (like the integer 0) it doesn't matter because if you replace 0 with 1 for a given key, then the key is pointing to a new object (the integer 1).
But if it's an array object and you "mutate" (change) it like array << "added" then that object, now containing "added", is the default for all future missing keys, and it is also the object that every earlier lookup of a missing key returned. They all refer to the single array object that looks like: ["added"]
By using a block, you are defaulting a NEW array object to the key. If you change the array object by adding an element, the other keys' objects are unchanged (they're different objects).
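A minimal sketch of the difference, using two hypothetical helper methods that append the same pairs under each kind of default:

```ruby
# Hypothetical helper: every missing key gets the one shared default array.
def collect_with_shared_default(pairs)
  result = Hash.new([])
  pairs.each { |key, value| result[key] << value }
  result
end

# Hypothetical helper: the block stores a new array for each missing key.
def collect_with_block(pairs)
  result = Hash.new { |hash, key| hash[key] = [] }
  pairs.each { |key, value| result[key] << value }
  result
end

pairs = [[:a, 1], [:b, 2], [:a, 3]]

shared = collect_with_shared_default(pairs)
shared.empty?    # => true, no key was ever stored
shared.default   # => [1, 2, 3], everything landed in the single default array

fresh = collect_with_block(pairs)
fresh == { a: [1, 3], b: [2] }  # => true
```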

Check hash if key exists; if not, create array and append value

Example as follows:
if house['windows'][floor_1]
  house['windows'][floor_1] << north_side
else
  house['windows'][floor_1] = [north_side]
end
Best way to check for existing key?
The fact that house['windows'] is an element in a hash already is a bit of a red herring, so I will use windows as a variable referencing a hash.
Set up a default value for the windows hash, so that any non-preexisting key is assigned an array value:
windows = Hash.new {|hash, key| hash[key] = [] }
Now you can append (<<) to new hash elements automatically.
windows['floor_1'] << 'north_side'
windows # => {"floor_1"=>["north_side"]}
For your specific case, replace windows with house['windows'].
EDIT
As pointed out in the comments, this behavior can be added to an already-instantiated hash:
windows.default_proc = proc {|hash, key| hash[key] = [] }
I would do something like:
house['windows'][floor_1] ||= []
house['windows'][floor_1] << north_side
Given your Hash, I imagine:
house = { windows: { floor_0: ['f0'] } }
You can check the existence of a key using Hash#has_key?
house[:windows].has_key? :floor_1 #=> false
So you can create it:
house[:windows].merge!({floor_1: []}) unless house[:windows].has_key? :floor_1
Better still, define a default value using, for example, Hash#default_proc=:
house[:windows].default_proc = proc { |h, k| h[k] = [] }
So you can write:
house[:windows][:floor_3] << 'f3'
house #=> {:windows=>{:floor_0=>["f0"], :floor_1=>[], :floor_3=>["f3"]}}
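One side effect of an assigning default proc is that a plain lookup with [] stores the key. The sketch below (array_hash is a hypothetical helper name) shows that key? and fetch let you check for a key without creating it:

```ruby
# Hypothetical helper: a hash whose missing keys are initialized to a new array.
def array_hash
  Hash.new { |hash, key| hash[key] = [] }
end

windows = array_hash
windows[:floor_1] << 'north_side'

windows.key?(:floor_2)       # => false, key? never runs the default proc
windows.fetch(:floor_2, [])  # => [], fetch uses its own fallback instead
windows.key?(:floor_2)       # => false, still nothing stored

windows[:floor_2]            # => [], but this lookup stores the empty array
windows.key?(:floor_2)       # => true
```

If stray empty arrays would be a problem, the ||= form shown earlier only creates an entry at the point where you actually append.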

Create the new hash from the old one with Ruby

I have some simple_hash:
old_hash = {"New"=>"0"}
I want to convert it to a new format:
new_hash = old_hash.keys.each do |key|
  hash = Hash.new
  hash[key] = {count: old_hash[key]}
  hash
end
but this code returns:
["New"]
instead of:
{"New"=>{:count=>"0"}}
And the question is why?
You are confusing the return value of a block with that of the method it is passed to. each ignores the block's result and returns its receiver, so in your code new_hash gets the value of old_hash.keys, which is not what you want.
A small modification works:
new_hash = Hash.new
old_hash.keys.each do |key|
  new_hash[key] = {count: old_hash[key]}
end
Do this:
hash = Hash.new
old_hash.keys.each do |key|
  hash[key] = {count: old_hash[key]}
end
hash
# => {"New"=>{:count=>"0"}}
Since you placed hash = Hash.new inside the loop, you are creating a new hash every time.
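As an alternative to building the hash by hand, here is a sketch using Hash#transform_values (available since Ruby 2.4), which returns a new hash rather than the receiver. The method name wrap_counts is made up for illustration.

```ruby
# Hypothetical helper: wraps every value in a { count: ... } hash.
def wrap_counts(hash)
  hash.transform_values { |value| { count: value } }
end

old_hash = { 'New' => '0' }
new_hash = wrap_counts(old_hash)

new_hash == { 'New' => { count: '0' } }  # => true
old_hash == { 'New' => '0' }             # => true, the original is left untouched

# For comparison, each hands back the array it was called on:
old_hash.keys.each { |key| key } == ['New']  # => true
```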

How do I add an object to an array, where the array is a value to a key in a hash?

So basically my code is as follows
anagrams = Hash.new([])
self.downcase.scan(/\b[a-z]+/i).each do |key|
anagrams[key.downcase.chars.sort] = #push key into array
end
so basically the hash would look like this
anagrams = { "abcdef" => ["fdebca", "edfcba"], "jklm" => ["jkl"]}
Basically what I don't understand is how to push "key" (which is obviously a string) as the value to "eyk"
I've been searching for a while, including the documentation and other Stack Overflow questions, and this was my best guess
anagrams[key.downcase.chars.sort].push(key)
Your guess:
anagrams[key.downcase.chars.sort].push(key)
is right. The problem is your hash's default value:
anagrams = Hash.new([])
A default value doesn't automatically create an entry in the hash when you reference it, it just returns the value. That means that you can do this:
h = Hash.new([])
h[:k].push(6)
without changing h at all. The h[:k] gives you the default value ([]) but it doesn't add :k as a key. Also note that the same default value is used every time you try to access a key that isn't in the hash so this:
h = Hash.new([])
a = h[:k].push(6)
b = h[:x].push(11)
will leave you with [6,11] in both a and b but nothing in h.
If you want to automatically add defaults when you access them, you'll need to use a default_proc, not a simple default:
anagrams = Hash.new { |h, k| h[k] = [] }
That will create the entries when you access a non-existent key and give each one a different empty array.
It's not entirely clear what your method is supposed to do, but I think the problem is that you don't have an array to push a value onto.
In Ruby you can pass a block to Hash.new that tells it what to do when you try to access a key that doesn't exist. This is a handy way to automatically initialize values as empty arrays. For example:
hsh = Hash.new {|hsh, key| hsh[key] = [] }
hsh[:foo] << "bar"
p hsh # => { :foo => [ "bar" ] }
In your method (which I assume you're adding to the String class), you would use it like this:
class String
  def my_method
    anagrams = Hash.new {|hsh, key| hsh[key] = [] }
    downcase.scan(/\b[a-z]+/i).each_with_object(anagrams) do |key|
      anagrams[key.downcase.chars.sort.join] << key
    end
  end
end
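For trying this out without reopening String, here is a standalone sketch of the same idea. The method name group_anagrams and the sample sentence are assumptions for illustration, not from the question.

```ruby
# Hypothetical helper: groups the words of a text by their sorted letters.
def group_anagrams(text)
  groups = Hash.new { |hash, key| hash[key] = [] }
  text.downcase.scan(/[a-z]+/).each do |word|
    groups[word.chars.sort.join] << word
  end
  groups
end

result = group_anagrams('Listen, silent night; thing enlist')
result == {
  'eilnst' => %w[listen silent enlist],
  'ghint'  => %w[night thing]
}  # => true

# Enumerable#group_by builds the same structure without a default proc:
words = %w[listen silent night thing enlist]
words.group_by { |word| word.chars.sort.join } == result  # => true
```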

Best way to check for nil and update hash value

I have a hash where the values are all arrays. I want to look up a key in this hash. If it exists, I want to add a value to the array. If it does not exist (hash[key] returns nil), then I need to create the array and add one value. Currently I have this:
hash[key].push elem unless hash[key].nil?
hash[key] ||= [elem]
This involves 3 lookups. I'm new to ruby so I'm sure there's a better way to do this. What is it?
My original plan was to make the default value for the hash [ ]. Then I can just use:
hash[key].push elem
Unfortunately if the key does not exist, that will only change the default value and not add a new key.
In this case you need to create the hash as below:
hash = Hash.new { |h,k| h[k] = [] }
This form exists to handle situations like yours. See new {|hash, key| block } → new_hash in the Hash documentation.
hash = Hash.new { |h,k| h[k] = [] }
hash[:key1] << 1
hash[:key2] << 2
hash[:key1] << 3
hash # => {:key1=>[1, 3], :key2=>[2]}
You can try:
(hash[key] ||= []) << elem
However Arup's answer is much better.
You should create your hash with a default value.
hash = Hash.new { |h,k| h[k] = [] }
hash[key].push elem
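When the hash already exists and you would rather not change its default, the ||= one-liner from the earlier answer can be wrapped in a small helper. This is only a sketch, and the name append_to is hypothetical.

```ruby
# Hypothetical helper: appends elem under key, creating the array if needed.
def append_to(hash, key, elem)
  (hash[key] ||= []) << elem
  hash
end

plain = {}
append_to(plain, :key1, 1)
append_to(plain, :key2, 2)
append_to(plain, :key1, 3)

plain == { key1: [1, 3], key2: [2] }  # => true
plain[:missing]                       # => nil, the hash's default is unchanged
plain.key?(:missing)                  # => false
```

One caveat of ||=: it also replaces a value that is stored as nil or false, which does not matter when every value is an array.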

Resources