is there a convenience way to build nested hash in ruby [duplicate] - ruby

This question already has answers here:
Ruby - Access multidimensional hash and avoid access nil object [duplicate]
(3 answers)
Closed 8 years ago.
I have several variables such as year, month, day, hour, and minute. I need to do some operations on a nested hash keyed by these values, like hash[2012][12][12][4][3]. When I meet a new date, I need to extend the hash.
I can do this by checking each level one by one and creating a hash if it does not exist, but I'm wondering whether there is a more convenient way.
In Perl, I can just write $hash{$v1}{$v2}{$v3}..{$vN} = something, because intermediate hashes that are not yet defined are created automatically.

I agree with sawa's comment. You will probably have many other problems with such approach. However, what you want is possible.
Search for "ruby hash default value".
By default, the default value of a Hash entry is nil. You can change that, e.g.:
h = Hash.new(5)
Now h returns 5, not nil, when asked for a non-existent key. You can tell it to return a new empty array or a new empty hash in a similar way.
But be careful: it is easy to accidentally SHARE one default instance across all entries.
h = Hash.new([]) # default value = Array object, let's name it X
one = h[:dad] # returns THE SAME object X
two = h[:mom] # returns THE SAME object X
You must be careful not to modify the shared default instance, and to use only operations that do not mutate it. You cannot just
h[:mom] << 'thing'
because h[:brandnewone] will then return the mutated default instance with "thing" inside.
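A minimal sketch of the pitfall and two ways around it. The variable names (shared, safe, per_key) are only illustrative:

```ruby
shared = Hash.new([])        # one Array object is the default for every missing key
shared[:mom] << 'thing'      # mutates that single default; nothing is stored under :mom
shared[:dad]                 # => ["thing"]
shared.keys                  # => []

safe = Hash.new([])
safe[:mom] += ['thing']      # builds a new array and assigns it; the default stays empty
safe[:dad]                   # => []
safe.keys                    # => [:mom]

per_key = Hash.new { |hash, key| hash[key] = [] }  # block creates a fresh array per missing key
per_key[:mom] << 'thing'
per_key[:dad]                # => [] (this lookup also stores :dad => [])
```

The block form is usually the least surprising, at the cost that merely reading a missing key stores an entry for it.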
See here for a good explanation and proper usage examples
or, even better: example of autovivifying hash
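For reference, a common way to write an autovivifying hash is to pass the hash's own default block down to every nested hash. This is a sketch; the method name autovivifying_hash is made up for the example:

```ruby
# Each missing key gets a new hash that reuses the same default block,
# so the nesting can go to any depth.
def autovivifying_hash
  Hash.new { |hash, key| hash[key] = Hash.new(&hash.default_proc) }
end

counts = autovivifying_hash
counts[2012][12][12][4][3] = 'something'
counts[2012][12][12][4][3]   # => "something"
counts[2012].keys            # => [12]
```

Note that reading a missing key also creates it, so check with key? when you only want to test for presence.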

You could add a helper method, which you might find useful in other contexts:
def mfetch(hash, *keys)
return nil if (keys.empty? || !hash[keys.first])
return hash[keys.first] if keys.size == 1
k = keys.shift
raise ArgumentError, "Too many keys" unless hash[k].is_a? Hash
return mfetch(hash[k], *keys)
end
h = {cat: {dog: {pig: 'oink'}}} # => {:cat=>{:dog=>{:pig=>"oink"}}}
mfetch(h, :cat, :dog, :pig) # => "oink"
mfetch(h, :cat, :dog) # => {:pig=>"oink"}
mfetch(h, :cat) # => {:dog=>{:pig=>"oink"}}
mfetch(h, :cow) # => nil
mfetch(h, :cat, :cow) # => nil
mfetch(h, :cat, :dog, :cow) # => nil
mfetch(h, :cat, :dog, :pig, :cow) # => ArgumentError: Too many keys
If you preferred, you could instead add the method to the Hash class:
class Hash
def mfetch(*keys)
return nil if (keys.empty? || !self[keys.first])
return self[keys.first] if keys.size == 1
k = keys.shift
raise ArgumentError, "Too many keys" unless self[k].is_a? Hash
return self[k].mfetch(*keys)
end
end
h.mfetch(:cat, :dog, :pig) # => "oink"
Or, if you are using Ruby 2.0 or later, you can define the method in a refinement (refine Hash do ... end inside a module) instead of reopening class Hash, so that the addition is only visible where that module is activated with using.
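On Ruby 2.3 and later, the built-in Hash#dig covers much the same ground as mfetch, so a helper may not be needed at all. A short comparison, reusing the example hash from above:

```ruby
h = { cat: { dog: { pig: 'oink' } } }

h.dig(:cat, :dog, :pig)   # => "oink"
h.dig(:cat, :dog)         # the inner hash holding pig: "oink"
h.dig(:cow)               # => nil
h.dig(:cat, :cow)         # => nil

# Where mfetch raises ArgumentError, digging past a non-hash leaf raises TypeError.
error_class =
  begin
    h.dig(:cat, :dog, :pig, :cow)
  rescue TypeError => e
    e.class
  end
error_class               # => TypeError
```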

Related

Ruby Set with custom class to equal basic strings

I want to be able to find a custom class in my set given just a string. Like so:
require 'set'
Rank = Struct.new(:name, keyword_init: true) {
def hash
name.hash
end
def eql?(other)
hash == other.hash
end
def ==(other)
hash == other.hash
end
}
one = Rank.new(name: "one")
two = Rank.new(name: "two")
set = Set[one, two]
But while one == "one" and one.eql?("one") are both true, set.include?("one") is still false. What am I missing?
Thanks!
Set is built upon Hash, and Hash considers two objects the same if:
[...] their hash value is identical and the two objects are eql? to each other.
What you are missing is that eql? isn't necessarily commutative. Making Rank#eql? recognize strings doesn't change the way String#eql? works:
one.eql?('one') #=> true
'one'.eql?(one) #=> false
Therefore it depends on which object is the hash key and which is the argument to include?:
Set['one'].include?(one) #=> true
Set[one].include?('one') #=> false
In order to make two objects a and b interchangeable hash keys, 3 conditions have to be met:
a.hash == b.hash
a.eql?(b) == true
b.eql?(a) == true
But don't try to modify String#eql? – fiddling with Ruby's core classes isn't recommended and monkey-patching probably won't work anyway because Ruby usually calls the C methods directly for performance reasons.
In fact, making both hash and eql? mimic name doesn't seem like a good idea in the first place. It makes the object's identity ambiguous which can lead to very strange behavior and hard to find bugs:
h = { one => 1, 'one' => 1 }
#=> {#<struct Rank name="one">=>1, "one"=>1}
# vs
h = { 'one' => 1, one => 1 }
#=> {"one"=>1}
what am i missing?
What you are missing is that "one" isn't in your set. one is in your set, but "one" isn't.
Therefore, the answer Ruby is giving you is perfectly correct.
All that your implementation of Rank achieves is that any two ranks with the same name are considered the same by a Hash, Set, or Array#uniq. But a Rank is not the same as a String.
If you want to be able to have a set-like data structure where you can look up things by one of their attributes, you will have to write it yourself.
Something like (untested):
class RankSet < Set
def [](*args)
super(*args.map(&:name))
end
def each
return enum_for(__callee__) unless block_given?
super {|e| yield e.name }
end
end
might get you started.
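Since the snippet above is untested, here is a different minimal sketch of the same idea: a small wrapper that indexes ranks by name in a plain Hash, so lookups can take either a string or a Rank. The class name RankIndex and its methods are hypothetical, and Rank is redefined here without the custom hash/eql? so the example stands alone:

```ruby
Rank = Struct.new(:name, keyword_init: true)

# Hypothetical wrapper: keeps one rank per name and allows lookup by name.
class RankIndex
  include Enumerable

  def initialize(*ranks)
    @by_name = {}
    ranks.each { |rank| add(rank) }
  end

  # Keeps the first rank seen for a given name, like a set would.
  def add(rank)
    @by_name[rank.name] ||= rank
    self
  end
  alias_method :<<, :add

  # Accepts either a Rank or a bare name.
  def include?(name_or_rank)
    key = name_or_rank.is_a?(Rank) ? name_or_rank.name : name_or_rank
    @by_name.key?(key)
  end

  def [](name)
    @by_name[name]
  end

  def each(&block)
    @by_name.each_value(&block)
  end
end

ranks = RankIndex.new(Rank.new(name: "one"), Rank.new(name: "two"))
ranks.include?("one")     # => true
ranks.include?("three")   # => false
ranks["two"].name         # => "two"
```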
Or, instead of writing your own set, you can just use the fact that any arbitrary rank with the right name can be used for lookup:
set.include?(Rank.new(name: "one"))
#=> true
# even though it is a *different* `Rank` object

Ruby hash defaults: where do nested values go? [duplicate]

This question already has answers here:
Strange, unexpected behavior (disappearing/changing values) when using Hash default value, e.g. Hash.new([])
(4 answers)
Closed 6 years ago.
I wanted to use Ruby's default hash values to allow me to more easily nest hashes without having to manually initialize them. I thought it'd be nice to be able to dig a level down for each key safely without having pre-set the key as a hash. However, I find that when I do this, the data gets stored somewhere, but is not visible by accessing the top-level hash. Where does it go, and how does this work?
top = Hash.new({}) #=> {}
top[:first][:thing] = "hello" #=> "hello"
top[:second] = {thing: "world"} #=> {:thing => "world"}
top #=> {:second => {:thing => "world"}}
top[:first] #=> {:thing => "hello"}
You want to know where your inserted hash is? Maybe you have heard about Schrödinger's cat:
h = Hash.new({})
h[:box][:cat] = "Miau"
=> "Miau"
h
=> {}
The cat seems to be dead...
h[:schroedingers][:cat]
=> "Miau"
The cat seems to be still alive, but in a different reality...
Ok, if nothing helps, "Read The Fine Manual". For Hash.new, we read:
If obj is specified, this single object will be used for all default values.
So when you write h[:box], an object is returned. This object is another hash, and it happens to be empty.
Into this empty hash, you write a key-value pair.
Now this other hash is no longer empty; it has a key-value pair. And it is returned every time you look up a key that is not found in your original hash.
You can access the default value via a variety of #default methods
http://ruby-doc.org/core-2.2.3/Hash.html#method-i-default
top.default
=> {:thing=>"hello"}
You can also tell it how you want it to act, example:
irb(main):058:0> top = Hash.new {|h,k| h[k] = {}; h[k]}
=> {}
irb(main):059:0> top[:first][:thing] = "hello"
=> "hello"
irb(main):060:0> top[:second] = {thing: "world"}
=> {:thing=>"world"}
irb(main):061:0> top
=> {:first=>{:thing=>"hello"}, :second=>{:thing=>"world"}}

Set optional parameter put the others with a default value at nil

I am facing a weird problem with optional parameters in Ruby.
This is my code:
def foo options={:test => true}
puts options[:test]
end
foo # => puts true
foo :lol => 42 # => puts nil
I cannot figure out why the second call prints nil.
It seems that passing another parameter sets :test to nil.
Thanks.
It happens because a default parameter value is only used when no argument is passed. Passing a hash replaces it completely (i.e. it sets options = {:lol => 42}), so the options[:test] key no longer exists.
To give particular hash keys default values, try:
def foo options={}
options = {:test => true}.merge options
puts options[:test]
end
In this case, we merge a hash with default values for certain keys ({:test => true}), with another hash (containing the key=>values in the argument). If a key occurs in both hash objects, the value in the hash passed to the merge function will take precedence.
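As an alternative that the answer above does not use, Ruby 2.0 and later support keyword arguments, which give each option its own default. A sketch, with foo returning its values instead of printing them so the result is easy to inspect:

```ruby
# test defaults to true unless the caller passes it explicitly;
# any other keywords are collected into the others hash.
def foo(test: true, **others)
  [test, others]
end

foo               # => [true, {}]
foo(lol: 42)      # test is still true; others holds lol: 42
foo(test: false)  # => [false, {}]
```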

Idiomatic way of detecting duplicate keys in Ruby?

I've just noticed that Ruby doesn't raise an exception or even supply a warning if you supply duplicate keys to a hash:
$VERBOSE = true
key_value_pairs_with_duplicates = [[1,"a"], [1, "b"]]
# No warning produced
Hash[key_value_pairs_with_duplicates] # => {1=>"b"}
# Also no warning
hash_created_by_literal_with_duplicate_keys = {1 => "a", 1=> "b"} # => {1=>"b"}
For key_value_pairs_with_duplicates, I could detect duplicate keys by doing
keys = key_value_pairs_with_duplicates.map(&:first)
raise "Duplicate keys" unless keys.uniq == keys
Or by doing
procedurally_produced_hash = {}
key_value_pairs_with_duplicates.each do |key, value|
raise "Duplicate key" if procedurally_produced_hash.has_key?(key)
procedurally_produced_hash[key] = value
end
Or
hash = Hash[key_value_pairs_with_duplicates]
raise "Duplicate keys" unless hash.length == key_value_pairs_with_duplicates.length
But is there an idiomatic way to do it?
Hash#merge takes an optional block to define how to handle duplicate keys.
http://www.ruby-doc.org/core-1.9.3/Hash.html#method-i-merge
Taking advantage of the fact this block is only called on duplicate keys:
>> a = {a: 1, b: 2}
=> {:a=>1, :b=>2}
>> a.merge(c: 3) { |key, old, new| fail "Duplicate key: #{key}" }
=> {:a=>1, :b=>2, :c=>3}
>> a.merge(b: 10, c: 3) { |key, old, new| fail "Duplicate key: #{key}" }
RuntimeError: Duplicate key: b
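The same block can be applied to the array of pairs from the question by merging one pair at a time. A sketch; the helper name hash_without_duplicates is made up for the example:

```ruby
# Builds a hash from [key, value] pairs; the merge! block only runs
# when the key is already present, so it is a natural place to raise.
def hash_without_duplicates(pairs)
  pairs.each_with_object({}) do |(key, value), result|
    result.merge!(key => value) do |dup, _old, _new|
      raise ArgumentError, "Duplicate key: #{dup.inspect}"
    end
  end
end

clean = hash_without_duplicates([[1, "a"], [2, "b"]])
clean[1]        # => "a"

message =
  begin
    hash_without_duplicates([[1, "a"], [1, "b"]])
  rescue ArgumentError => e
    e.message
  end
message         # => "Duplicate key: 1"
```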
I think there are two idiomatic ways to handle this:
Use one of the Hash extensions that allow multiple values per key, or
Extend Hash (or patch w/ flag method) and implement []= to throw a dupe key exception.
You could also just decorate an existing hash with the []= that throws, or alias_method--either way, it's straight-forward, and pretty Ruby-ish.
I would simply build a hash from the array, checking for an existing key before overwriting it. This way it avoids creating any unnecessary temporary collections.
def make_hash(key_value_pairs_with_duplicates)
result = {}
key_value_pairs_with_duplicates.each do |pair|
key, value = pair
raise "Duplicate key" if result.has_key?(key)
result[key] = value
end
result
end
But no, I don't think there is an "idiomatic" way of doing this. Ruby just follows the "last one wins" rule, and if you don't like that, it's up to you to fix it.
In the literal form you are probably out of luck. But in the literal form why would you need to validate this? You are not getting it from a dynamic source if it's literal, so if you choose to dupe keys, it's your own fault. Just, uh... don't do that.
In other answers I've already stated my opinion that Ruby needs a standard method to build a hash from an enumerable. So, as you need your own abstraction for the task anyway, let's just take Facets' mash with the implementation you like the most (Enumerable#inject + Hash#update looks good to me) and add the check:
module Enumerable
def mash
inject({}) do |hash, item|
key, value = block_given? ? yield(item) : item
fail("Repeated key: #{key}") if hash.has_key?(key) # <- new line
hash.update(key => value)
end
end
end
I think most people here overthink the problem. To deal with duplicate keys, I'd simply do this:
arr = [ [:a,1], [:b,2], [:c,3] ]
hsh = {}
arr.each do |k,v|
raise("Whoa! I already have :#{k} key.") if hsh.has_key?(k)
hsh[k] = v
end
Or make a method out of this, maybe even extend a Hash class with it. Or create a child of Hash class (UniqueHash?) which would have this functionality by default.
But is it worth it? (I don't think so.) How often do we need to deal with duplicate keys in hash like this?
Recent Ruby versions do issue a warning when a key is duplicated in a hash literal. However, they still go ahead and re-assign the duplicate's value to the key, which is not always the desired behaviour. IMO, the best way to deal with this is to override the construction/assignment methods. E.g. to override #[]=
class MyHash < Hash
def []=(key,val)
if self.has_key?(key)
puts("key: #{key} already has a value!")
else
super(key,val)
end
end
end
So when you run:
h = MyHash.new
h[:A] = ['red']
h[:B] = ['green']
h[:A] = ['blue']
it will output
key: A already has a value!
{:A=>["red"], :B=>["green"]}
Of course you can tailor the overridden behaviour any which way you want.
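For instance, a variant that raises instead of printing might look like the sketch below. The class name UniqueHash and the choice of KeyError are assumptions made for the example:

```ruby
class UniqueHash < Hash
  # Refuses to overwrite an existing key.
  def []=(key, value)
    raise KeyError, "key #{key.inspect} already has a value" if key?(key)
    super
  end
end

h = UniqueHash.new
h[:A] = ['red']
h[:B] = ['green']

message =
  begin
    h[:A] = ['blue']
  rescue KeyError => e
    e.message
  end
message   # => "key :A already has a value"
h[:A]     # => ["red"]
```

Keep in mind that only assignments made through []= are guarded; other ways of filling the hash, such as merge!, will most likely bypass the override and would need the same treatment.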
I would avoid using an array to model a hash at all. In other words, don't construct the array of pairs in the first place. I'm not being facetious or dismissive. I'm speaking as someone who has used arrays of pairs and (even worse) balanced arrays many times, and always regretted it.

Search ruby hash for empty value

I have a ruby hash like this
h = {"a" => "1", "b" => "", "c" => "2"}
Now I have a Ruby function which evaluates this hash and returns true if it finds a key with an empty value. The following function always returns true, even if none of the values in the hash are empty:
def hash_has_blank(hsh)
hsh.each do |k,v|
if v.empty?
return true
end
end
return false
end
What am I doing wrong here?
Try this:
def hash_has_blank hsh
hsh.values.any?(&:empty?)
end
Or, if you are using an old 1.8.x Ruby:
def hash_has_blank hsh
hsh.values.any?{|i|i.empty?}
end
I hope you're ready to learn some Ruby magic here. I wouldn't define such a function globally like you did. If it's an operation on a hash, then it should be an instance method on the Hash class. You can do it like this:
class Hash
def has_blank?
self.reject{|k,v| !v.nil? && v.length > 0}.size > 0
end
end
reject returns a new hash containing only the nil or empty values, and then the size of this new hash is checked.
A possibly more efficient way (it doesn't have to traverse the whole hash):
class Hash
def has_blank?
self.values.any?{|v| v.nil? || v.length == 0}
end
end
But this will still traverse the whole hash if there is no empty value.
I've changed empty? to !nil? && length > 0 because I don't know how your empty method works.
If you just want to check if any of the values is an empty string you could do
h.has_value?('')
but your function seems to work fine.
I'd consider refactoring your model domain. Obviously the hash represents something tangible. Why not make it an object? If the item can be completely represented by a hash, you may wish to subclass Hash. If it's more complicated, the hash can be an attribute.
Secondly, the reason for which you are checking blanks can be named to better reflect your domain. You haven't told us the "why", but let's assume that your Item is only valid if it doesn't have any blank values.
class MyItem < Hash
def valid?
!invalid?
end
def invalid?
values.any?{|i| i.empty?}
end
end
The point is, if you can establish a vocabulary that makes sense in your domain, your code will be cleaner and more understandable. Using a Hash is just a means to an end and you'd be better off using more descriptive, domain-specific terms.
Using the example above, you'd be able to do:
my_item = MyItem["a" => "1", "b" => "", "c" => "2"]
my_item.valid? #=> false
