How do I use the fetch method for nested hash? - ruby

I have the following hash:
hash = {'name' => { 'Mike' => { 'age' => 10, 'gender' => 'm' } } }
I can access the age by:
hash['name']['Mike']['age']
What if I used Hash#fetch method? How can I retrieve a key from a nested hash?
As Sergio mentioned, the way to do it (without creating something for myself) would be by a chain of fetch methods:
hash.fetch('name').fetch('Mike').fetch('age')

From Ruby 2.3.0 onward, you can use Hash#dig:
hash.dig('name', 'Mike', 'age')
It also comes with the added bonus that if any value along the way turns out to be nil, you get nil instead of an exception.
You can use the ruby_dig gem until you migrate.
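To make the difference between the two approaches concrete, here is a small sketch using the hash from the question. The key 'Steve' is made up for illustration and is deliberately absent from the hash.

```ruby
hash = { 'name' => { 'Mike' => { 'age' => 10, 'gender' => 'm' } } }

# Chained fetch returns the value when every key exists.
age = hash.fetch('name').fetch('Mike').fetch('age')

# dig returns nil as soon as a key along the path is missing.
missing = hash.dig('name', 'Steve', 'age')

# The same lookup with chained fetch raises KeyError instead.
error = begin
  hash.fetch('name').fetch('Steve').fetch('age')
rescue KeyError => e
  e
end
```

Which one fits depends on whether a missing key is an expected case (dig) or a bug you want to hear about (fetch).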

EDIT: there is a built-in way now, see this answer.
There is no built-in method that I know of. I have this in my current project
class Hash
  def fetch_path(*parts)
    parts.reduce(self) do |memo, key|
      memo[key.to_s] if memo
    end
  end
end
# usage
hash.fetch_path('name', 'Mike', 'age')
You can easily modify it to use #fetch instead of #[] (if you so wish).
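As a rough sketch of that #fetch variant, written as a standalone method rather than a patch on Hash: the name fetch_path and its signature are assumptions for illustration, and unlike the version above it does not call to_s on the keys.

```ruby
# Hypothetical helper (not part of Ruby): walks the keys with Hash#fetch,
# so a missing key raises KeyError instead of silently returning nil.
def fetch_path(hash, *keys)
  keys.reduce(hash) { |memo, key| memo.fetch(key) }
end

hash = { 'name' => { 'Mike' => { 'age' => 10, 'gender' => 'm' } } }
age = fetch_path(hash, 'name', 'Mike', 'age')
```

With this version, fetch_path(hash, 'name', 'Bob', 'age') raises KeyError at 'Bob' rather than returning nil.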

As of Ruby 2.3.0:
You can also use &., the "safe navigation operator": hash&.[]('name')&.[]('Mike')&.[]('age'). This is safe even when hash itself is nil.
Using dig alone is not safe in that case, because hash.dig('name', 'Mike', 'age') raises NoMethodError if hash is nil.
However, you can combine the two: hash&.dig('name', 'Mike', 'age').
So either of the following is safe to use:
hash&.[]('name')&.[]('Mike')&.[]('age')
hash&.dig('name', 'Mike', 'age')
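A short sketch of the nil case this answer is concerned with. It assumes the variable holding the hash may itself be nil, for example because an earlier lookup failed.

```ruby
hash = nil

# Safe navigation short-circuits on nil, so both forms return nil.
via_dig   = hash&.dig('name', 'Mike', 'age')
via_index = hash&.[]('name')&.[]('Mike')&.[]('age')

# Plain dig on nil raises, because NilClass has no dig method.
error = begin
  hash.dig('name', 'Mike', 'age')
rescue NoMethodError => e
  e
end
```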

If your goal is to raise a KeyError when any of the intermediate keys is missing, then you need to write your own method. If instead you're using fetch to provide default values for missing keys, then you can avoid fetch altogether by constructing the Hashes with default values.
hash = Hash.new { |h1, k1| h1[k1] = Hash.new { |h2, k2| h2[k2] = Hash.new { |h3, k3| } } }
hash['name']['Mike']
# {}
hash['name']['Steve']['age'] = 20
hash
# {"name"=>{"Mike"=>{}, "Steve"=>{"age"=>20}}}
This won't work for arbitrarily nested Hashes; you need to choose the maximum depth when you construct them.
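One common way around the fixed depth, which is not part of the answer above, is to have each new inner Hash reuse the outer Hash's default block. A minimal sketch, with the variable name deep chosen only for illustration:

```ruby
# Each missing key gets a new Hash that shares the same default block,
# so the nesting can go as deep as needed.
deep = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }

deep['name']['Steve']['age'] = 20
deep['a']['b']['c']['d']['e'] = 1
```

The trade-off is that merely reading a missing key also creates it, so lookups can leave empty Hashes behind.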

A version for those on Ruby 2.2 or lower that uses a standalone method instead of adding to the Hash class:
def dig(dict, *args)
  key = args.shift
  if args.empty?
    dict[key]
  else
    dig(dict[key], *args)
  end
end
And so you can do:
data = { a: 1, b: {c: 2}}
dig(data, :a) == 1
dig(data, :b, :c) == 2

If you don't want to monkey-patch the standard Hash class, use the .fetch(x, {}) variant. The example above would then look like this:
hash.fetch('name', {}).fetch('Mike', {}).fetch('age')
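A small sketch of how that chain behaves when an intermediate key is missing. The key 'Steve' is made up for illustration and is not in the hash.

```ruby
hash = { 'name' => { 'Mike' => { 'age' => 10, 'gender' => 'm' } } }

# The {} defaults absorb the missing 'Steve', so only the last fetch can raise.
error = begin
  hash.fetch('name', {}).fetch('Steve', {}).fetch('age')
rescue KeyError => e
  e
end

# Giving the last fetch a default as well makes the whole chain return it.
age = hash.fetch('name', {}).fetch('Steve', {}).fetch('age', nil)
```

Note that the KeyError then reports the last key ('age'), not the intermediate key that was actually missing.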

The point of fetch is that an explicit error is raised at the point of contract violation instead of having to track down a silent nil running amok in the code that can lead to unpredictable state.
Although dig is elegant and useful when you expect nil to be a default, it doesn't offer the same error reporting guarantees as fetch. OP seems to want the explicit errors of fetch but without the ugly verbosity and chaining.
An example use case is receiving a plain nested hash from YAML.load_file() and requiring explicit errors for missing keys.
One option is to alias [] to fetch as shown here, but this isn't a deep operation on a nested structure.
I ultimately used a recursive function and hash.instance_eval {alias [] fetch} to apply the alias to such a plain hash deeply. A class would work just as well, with the benefit of a distinct subclass separate from Hash.
irb(main):001:1* def deeply_alias_fetch!(x)
irb(main):002:2* if x.instance_of? Hash
irb(main):003:2* x.instance_eval {alias [] fetch}
irb(main):004:2* x.each_value {|v| deeply_alias_fetch!(v)}
irb(main):005:2* elsif x.instance_of? Array
irb(main):006:2* x.each {|e| deeply_alias_fetch!(e)}
irb(main):007:1* end
irb(main):008:0> end
=> :deeply_alias_fetch!
irb(main):009:0> h = {:a => {:b => 42}, :c => [{:d => 1, :e => 2}, {}]}
irb(main):010:0> deeply_alias_fetch!(h)
=> {:a=>{:b=>42}, :c=>[{:d=>1, :e=>2}, {}]}
irb(main):011:0> h[:a][:bb]
Traceback (most recent call last):
5: from /usr/bin/irb:23:in `<main>'
4: from /usr/bin/irb:23:in `load'
3: from /usr/lib/ruby/gems/2.7.0/gems/irb-1.2.1/exe/irb:11:in `<top (required)>'
2: from (irb):11
1: from (irb):11:in `fetch'
KeyError (key not found: :bb)
Did you mean? :b
irb(main):012:0> h[:c][0][:e]
=> 2
irb(main):013:0> h[:c][0][:f]
Traceback (most recent call last):
5: from /usr/bin/irb:23:in `<main>'
4: from /usr/bin/irb:23:in `load'
3: from /usr/lib/ruby/gems/2.7.0/gems/irb-1.2.1/exe/irb:11:in `<top (required)>'
2: from (irb):14
1: from (irb):14:in `fetch'
KeyError (key not found: :f)
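The subclass alternative mentioned above could look roughly like the following. StrictHash and deep_strict are hypothetical names, and the sketch copies the data into new objects rather than modifying the original hash in place.

```ruby
# Hypothetical subclass: [] behaves like fetch and raises KeyError on a miss.
class StrictHash < Hash
  def [](key)
    fetch(key)
  end
end

# Recursively copies plain Hashes (and Hashes inside Arrays) into StrictHash.
def deep_strict(value)
  case value
  when Hash
    value.each_with_object(StrictHash.new) { |(k, v), out| out[k] = deep_strict(v) }
  when Array
    value.map { |e| deep_strict(e) }
  else
    value
  end
end

h = deep_strict({ a: { b: 42 }, c: [{ d: 1, e: 2 }, {}] })
```

Here h[:a][:b] returns 42, while h[:a][:bb] raises KeyError, and ordinary Hash instances elsewhere in the program are unaffected.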

If you can, use:
hash[["ayy","bee"]]
instead of:
hash["ayy"]["bee"]
It'll save a lot of annoyances.

Related

Ruby access properties with dot-notation

I'm trying to build a class that will basically be used as a data structure for storing values/nested values. I want there to be two methods, get and set, that accept a dot-notated path to recursively set or get variables.
For example:
bag = ParamBag.new
bag.get('foo.bar') # => nil
bag.set('foo.bar', 'baz')
bag.get('foo.bar') # => 'baz'
The get method could also take a default return value if the value doesn't exist:
bag.get('foo.baz', false) # => false
I could also initialize a new ParamBag with a Hash.
How would I manage this in Ruby? I've done this in other languages, but in order to set a recursive path, I would take the value by reference, but I'm not sure how I'd do it in Ruby.
This was a fun exercise but still falls under the "you probably should not do this" category.
To accomplish what you want, OpenStruct can be used with some slight modifications.
require 'ostruct'

class ParamBag < OpenStruct
  def method_missing(name, *args, &block)
    if super.nil?
      modifiable[new_ostruct_member(name)] = ParamBag.new
    end
  end
end
This class will let you chain however many method calls together you would like and set any number of parameters.
Tested with Ruby 2.2.1
2.2.1 :023 > p = ParamBag.new
=> #<ParamBag>
2.2.1 :024 > p.foo
=> #<ParamBag>
2.2.1 :025 > p.foo.bar
=> #<ParamBag>
2.2.1 :026 > p.foo.bar = {}
=> {}
2.2.1 :027 > p.foo.bar
=> {}
2.2.1 :028 > p.foo.bar = 'abc'
=> "abc"
Basically, take your get and set methods away and call methods like you would normally.
I do not advise you to actually do this. I would instead suggest using OpenStruct by itself to achieve some flexibility without going too crazy. If you find yourself needing to chain a ton of methods and have them never fail, maybe take a step back and ask "is this really the right way to approach this problem?" If the answer to that question is a resounding yes, then ParamBag might just be perfect.
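For the get/set interface the question describes, a plain nested Hash may be enough. The sketch below is one possible shape, not the method from the answer above: it assumes string keys and dot-separated paths, and it does not guard against setting a path underneath an existing non-Hash value.

```ruby
# Hypothetical implementation of the ParamBag interface from the question.
class ParamBag
  def initialize(data = {})
    @data = data
  end

  # Returns the value at the dotted path, or default if any segment is missing.
  def get(path, default = nil)
    path.split('.').reduce(@data) do |node, key|
      break default unless node.is_a?(Hash) && node.key?(key)
      node[key]
    end
  end

  # Creates intermediate Hashes as needed, then stores the value.
  def set(path, value)
    *parents, last = path.split('.')
    node = parents.reduce(@data) { |memo, key| memo[key] ||= {} }
    node[last] = value
  end
end

bag = ParamBag.new
before = bag.get('foo.bar')
bag.set('foo.bar', 'baz')
after = bag.get('foo.bar')
fallback = bag.get('foo.baz', false)
```

Ruby passes object references, so walking down with reduce and assigning into the last Hash updates the stored structure without any explicit by-reference mechanism.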

Ruby - Parsing a string of a Hash using YAML - Error if hash entered raw and coerced to string rather than entered as string

I have a gem I have created that wraps Git as a key:value store (dictionary/hash). The source is here.
The way it works in the process referenced is as follows:
run the function set containing a key and a value argument
hash these with git, have the key point at the hash
return the key if this operation is successful, and add it to the global dictionary holding keys and hashes
Now, if I call something like
db.set('key', {some: 'value'})
# => 'key'
and then try to retrieve this,
db.get('key')
Psych::SyntaxError: (<unknown>): did not find expected node content while parsing a flow node at line 1 column 2
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:370:in `parse'
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:370:in `parse_stream'
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:318:in `parse'
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:245:in `load'
from /home/bobby/.rvm/gems/ruby-2.2.1/gems/gkv-0.2.1/lib/gkv/database.rb:21:in `get'
from (irb):6
from /home/bobby/.rvm/rubies/ruby-2.2.1/bin/irb:11:in `<main>'
Now, if I set the key as that same dictionary, but as a string:
db.set('key', "{some: 'value'}")
# => 'key'
db.get('key')
# => {"some"=>"value"}
db.get('key').class
=> Hash
The code that performs the git operations and wraps them as a key-value store is:
...
def get(key)
  if $ITEMS.keys.include? key
    YAML.load(Gkv::GitFunctions.cat_file($ITEMS[key].last))
  else
    raise KeyError
  end
end
def set(key, value)
  update_items(key, value.to_s)
  key
end
...
And the source of the update_items function referenced here is:
...
  def update_items(key, value)
    if $ITEMS.keys.include? key
      history = $ITEMS[key]
      history << Gkv::GitFunctions.hash_object(value.to_s)
      $ITEMS[key] = history
    else
      $ITEMS[key] = [Gkv::GitFunctions.hash_object(value.to_s)]
    end
  end
end
...
hash_object and cat_file simply wrap git hash-object and git cat-file in methods that write the input to a tempfile, git add it, and then erase the tempfile.
I'm really at a loss as to why this works with strings but not true dictionaries. It results in the exact same error if you use the old hashrocket syntax as well:
db.set('a', {:key => 'value'})
=> "a"
db.get('a')
# => Psych::SyntaxError: (<unknown>): did not find expected node content while parsing a flow node at line 1 column 2
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:370:in `parse'
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:370:in `parse_stream'
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:318:in `parse'
from /home/bobby/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/psych.rb:245:in `load'
from /home/bobby/.rvm/gems/ruby-2.2.1/gems/gkv-0.2.1/lib/gkv/database.rb:21:in `get'
from (irb):6
from /home/bobby/.rvm/rubies/ruby-2.2.1/bin/irb:11:in `<main>'
Any ideas?
In your get method you call YAML.load, but in your set method you use .to_s. This means that the YAML parser is trying to read an arbitrary string as if it were YAML. For symmetry YAML.dump should be used in the set method instead.
I've created a pull request with the changes.
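The asymmetry can be seen without git at all. The sketch below uses a string-keyed hash as a stand-in for the stored value; string keys are an assumption made here because newer Psych versions restrict which classes YAML.load will deserialize.

```ruby
require 'yaml'

value = { 'some' => 'value' }

# to_s produces Ruby's inspect format (with "=>"), which is not YAML.
as_text = value.to_s

# YAML.dump produces text that YAML.load can read back into the same Hash.
as_yaml  = YAML.dump(value)
restored = YAML.load(as_yaml)
```

Serializing with YAML.dump in set therefore pairs correctly with the YAML.load already used in get.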

chef 11: any way to turn attributes into a ruby hash?

I'm generating a config for my service in chef attributes. However, at some point, I need to turn the attribute mash into a simple ruby hash. This used to work fine in Chef 10:
node.myapp.config.to_hash
However, starting with Chef 11, this does not work. Only the top level of the attribute is converted to a hash, with the nested values remaining immutable mash objects. Modifying them leads to errors like this:
Chef::Exceptions::ImmutableAttributeModification
------------------------------------------------
Node attributes are read-only when you do not specify which precedence level to set. To
set an attribute use code like `node.default["key"] = "value"'
I've tried a bunch of ways to get around this issue which do not work:
node.myapp.config.dup.to_hash
JSON.parse(node.myapp.config.to_json)
The json parsing hack, which seems like it should work great, results in:
JSON::ParserError
unexpected token at '"#<Chef::Node::Attribute:0x000000020eee88>"'
Is there any actual reliable way, short of including a nested parsing function in each cookbook, to convert attributes to a simple, ordinary, good old ruby hash?
after a resounding lack of answers both here and on the opscode chef mailing list, i ended up using the following hack:
class Chef
  class Node
    class ImmutableMash
      def to_hash
        h = {}
        self.each do |k,v|
          if v.respond_to?('to_hash')
            h[k] = v.to_hash
          else
            h[k] = v
          end
        end
        return h
      end
    end
  end
end
i put this into the libraries dir in my cookbook; now i can use attribute.to_hash in both chef 10 (which already worked properly and which is unaffected by this monkey-patch) and chef 11. i've also reported this as a bug to opscode:
if you don't want to have to monkey-patch your chef, speak up on this issue:
http://tickets.opscode.com/browse/CHEF-3857
Update: monkey-patch ticket was marked closed by these PRs
I hope I am not too late to the party but merging the node object with an empty hash did it for me:
chef (12.6.0)> {}.merge(node).class
=> Hash
I had the same problem and after much hacking around came up with this:
json_string = node[:attr_tree].inspect.gsub(/\=\>/,':')
my_hash = JSON.parse(json_string, {:symbolize_names => true})
inspect does the deep parsing that is missing from the other methods proposed and I end up with a hash that I can modify and pass around as needed.
This has been fixed for a long time now:
[1] pry(main)> require 'chef/node'
=> true
[2] pry(main)> node = Chef::Node.new
[....]
[3] pry(main)> node.default["fizz"]["buzz"] = { "foo" => [ { "bar" => "baz" } ] }
=> {"foo"=>[{"bar"=>"baz"}]}
[4] pry(main)> buzz = node["fizz"]["buzz"].to_hash
=> {"foo"=>[{"bar"=>"baz"}]}
[5] pry(main)> buzz.class
=> Hash
[6] pry(main)> buzz["foo"].class
=> Array
[7] pry(main)> buzz["foo"][0].class
=> Hash
[8] pry(main)>
Probably fixed sometime in or around Chef 12.x or Chef 13.x; it is certainly no longer an issue in Chef 15.x/16.x/17.x.
The above answer is a little unnecessary. You can just do this:
json = node[:whatever][:whatever].to_hash.to_json
JSON.parse(json)

Exact opposite to Ruby's CGI.parse method?

I'd like to do some sanitization of query params.
I parse the query with CGI.parse, then I delete some params, but I can't find an opposite method to build the query.
I don't really want to do something like
params.map{|n,v| "#{CGI.escape n}=#{CGI.escape v.to_s}"}.join("&")
There's got to be a simpler way. Is there?
There is a nice method in the URI module:
require 'uri'
URI.encode_www_form("q" => "ruby", "lang" => "en") #=> "q=ruby&lang=en"
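URI.encode_www_form also accepts array values, which matches the shape CGI.parse returns. In this sketch the params hash is written out by hand to mimic that shape, and the parameter names are made up for illustration.

```ruby
require 'uri'

# Same shape CGI.parse returns: every value is an array of strings.
params = { 'q' => ['ruby lang'], 'tag' => ['a', 'b'], 'debug' => ['1'] }

# Drop the parameter we want to sanitize away, then rebuild the query.
params.delete('debug')
query = URI.encode_www_form(params)
# query is "q=ruby+lang&tag=a&tag=b"
```

Repeated keys are emitted once per array element, so no manual map and join is needed.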
If you're using Rails (or don't mind pulling in ActiveSupport), then you can use to_param (AKA to_query):
{ :a => '&', :b => 'Where is pancake house?', :c => ['an', 'array'] }.to_param
# a=%26&b=Where+is+pancake+house%3F&c%5B%5D=an&c%5B%5D=array
to_param handles arrays a little differently than your version though, it'll put out c[]=an&c[]=array rather than just c=an&c=array.
While there's no better answer, I'll put up the method which I'm using now.
require 'cgi'

def build_query(params)
  params.map do |name, values|
    values.map do |value|
      "#{CGI.escape name}=#{CGI.escape value}"
    end
  end.flatten.join("&")
end
I am not sure if the following is a simplification, but it avoids expanding the (key, value) pairs of a hash.
params.map{|qq| qq.map{|q| CGI.escape(q)}.join('=')}.join('&')

Ruby Autovivification

I've been trying to use autovivification in ruby to do simple record consolidation on this:
2009-08-21|09:30:01|A1|EGLE|Eagle Bulk Shpg|BUY|6000|5.03
2009-08-21|09:30:35|A2|JOYG|Joy Global Inc|BUY|4000|39.76
2009-08-21|09:30:35|A2|LEAP|Leap Wireless|BUY|2100|16.36
2009-08-21|09:30:36|A1|AINV|Apollo Inv Cp|BUY|2300|9.15
2009-08-21|09:30:36|A1|CTAS|Cintas Corp|SELL|9800|27.83
2009-08-21|09:30:38|A1|KRE|SPDR KBW Regional Banking ETF|BUY|9200|21.70
2009-08-21|09:30:39|A1|APA|APACHE CORPORATION|BUY|5700|87.18
2009-08-21|09:30:40|A1|FITB|Fifth Third Bancorp|BUY|9900|10.86
2009-08-21|09:30:40|A1|ICO|INTERNATIONAL COAL GROUP, INC.|SELL|7100|3.45
2009-08-21|09:30:41|A1|NLY|ANNALY CAPITAL MANAGEMENT. INC.|BUY|3000|17.31
2009-08-21|09:30:42|A2|GAZ|iPath Dow Jones - AIG Natural Gas Total Return Sub-Index ETN|SELL|6600|14.09
2009-08-21|09:30:44|A2|CVBF|Cvb Finl|BUY|1100|7.64
2009-08-21|09:30:44|A2|JCP|PENNEY COMPANY, INC.|BUY|300|31.05
2009-08-21|09:30:36|A1|AINV|Apollo Inv Cp|BUY|4500|9.15
So, for example, I want the record for A1 AINV BUY 9.15 to have a total of 6800. This is a perfect problem to use autovivification on. So here's my code:
#!/usr/bin/ruby
require 'facets'
h = Hash.autonew
File.open('trades_long.dat','r').each do |line|
  @date,@time,@account,@ticker,@desc,@type,amount,@price = line.chomp.split('|')
  if @account != "account"
    puts "#{amount}"
    h[@account][@ticker][@type][@price] += amount
  end
  #puts sum.to_s
end
The problem is no matter how I try to sum up the value in h[@account][@ticker][@type][@price] it gives me this error:
6000
/usr/local/lib/ruby/gems/1.9.1/gems/facets-2.7.0/lib/core/facets/hash/op_add.rb:8:in `merge': can't convert String into Hash (TypeError)
from /usr/local/lib/ruby/gems/1.9.1/gems/facets-2.7.0/lib/core/facets/hash/op_add.rb:8:in `+'
from ./trades_consolidaton.rb:13
from ./trades_consolidaton.rb:8:in `each'
from ./trades_consolidaton.rb:8
I've tried using different "autovivification" methods with no result. This wouldn't happen in Perl! The autovivification would know what you are trying to do. Ruby doesn't seem to have this feature.
So my question really is, how do I perform simple "consolidation" of records in Ruby? Specifically, how do I get the total for something like:
h[@account][@ticker][@type][@price]
Many thanks for your help!!
Just to clarify on glenn's solution: that would be perfect, except that the following (with a few modifications to use the standard CSV library in Ruby 1.9):
CSV.foreach("trades_long.dat", :col_sep => "|") do |row|
  date,time,account,ticker,desc,type,amount,price = *row
  records[[account,ticker,type,price]] += amount
end
gives the following error:
TypeError: String can't be coerced into Fixnum
from (irb):64:in `+'
from (irb):64:in `block in irb_binding'
from /usr/local/lib/ruby/1.9.1/csv.rb:1761:in `each'
from /usr/local/lib/ruby/1.9.1/csv.rb:1197:in `block in foreach'
from /usr/local/lib/ruby/1.9.1/csv.rb:1335:in `open'
from /usr/local/lib/ruby/1.9.1/csv.rb:1196:in `foreach'
from (irb):62
from /usr/local/bin/irb:12:in `<main>'
I agree with Jonas that you (and Sam) are making this more complicated than it needs to be, but I think even his version is too complicated. I'd just do this:
require 'fastercsv'
records = Hash.new(0)
FasterCSV.foreach("trades_long.dat", :col_sep => "|") do |row|
  date,time,account,ticker,desc,type,amount,price = row.fields
  records[[account,ticker,type,price]] += amount.to_f
end
Now you have a hash with total amounts for each unique combination of account, ticker, type and price.
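The same idea can be tried without any gem. This sketch embeds three of the sample lines from the question and splits them with String#split, which is an assumption made to keep it self-contained; it is not how the answer above reads the file.

```ruby
# Three of the sample records from the question, two of which share a key.
lines = <<~DATA.lines
  2009-08-21|09:30:36|A1|AINV|Apollo Inv Cp|BUY|2300|9.15
  2009-08-21|09:30:36|A1|CTAS|Cintas Corp|SELL|9800|27.83
  2009-08-21|09:30:36|A1|AINV|Apollo Inv Cp|BUY|4500|9.15
DATA

# Hash.new(0) makes missing keys start at 0, so += works on first sight.
totals = Hash.new(0)
lines.each do |line|
  _date, _time, account, ticker, _desc, type, amount, price = line.chomp.split('|')
  totals[[account, ticker, type, price]] += amount.to_i
end

ainv_total = totals[['A1', 'AINV', 'BUY', '9.15']]
# ainv_total is 6800
```

The amount has to be converted from a String before adding, which is also what the TypeError earlier in this thread was complaining about.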
If you want a hash builder that works that way, you are going to have to redefine the + semantics.
For example, this works fine:
class HashBuilder
  def initialize
    @hash = {}
  end
  def []=(k,v)
    @hash[k] = v
  end
  def [](k)
    @hash[k] ||= HashBuilder.new
  end
  def +(val)
    val
  end
end
h = HashBuilder.new
h[1][2][3] += 1
h[1][2][3] += 3
p h[1][2][3]
# prints 4
Essentially you are trying to apply the + operator to a Hash.
>> {} + {}
NoMethodError: undefined method `+' for {}:Hash
from (irb):1
However, in facets:
>> require 'facets'
>> {1 => 10} + {2 => 20}
=> {1 => 10, 2 => 20}
>> {} + 100
TypeError: can't convert Fixnum into Hash
from /usr/lib/ruby/gems/1.8/gems/facets-2.7.0/lib/core/facets/hash/op_add.rb:8:in `merge'
from /usr/lib/ruby/gems/1.8/gems/facets-2.7.0/lib/core/facets/hash/op_add.rb:8:in `+'
from (irb):6
>> {} += {1 => 2}
=> {1=>2}
>>
If you want to redefine the + semantics for your hash in this occasion you can do:
class Hash; def +(v); v; end; end
Place this snippet before your original sample and all should be well. Keep in mind that you are changing the defined behavior of + (note that + is not defined on Hash by default; it's pulled in by facets).
It looks like you are making it more complicated than it has to be. I would use the FasterCSV gem and Enumerable#inject something like this:
require 'fastercsv'
records=FasterCSV.read("trades_long.dat", :col_sep => "|")
records.sort_by {|r| r[3]}.inject(nil) {|before, curr|
  if !before.nil? && curr[3]==before[3]
    curr[6]=(curr[6].to_i+before[6].to_i).to_s
    records.delete(before)
  end
  before=curr
}
For others that find their way here, there is now also another option:
require 'xkeys' # on rubygems.org
h = {}.extend XKeys::Hash
...
# Start with 0.0 (instead of nil) and add the amount
h[@account, @ticker, @type, @price, :else => 0.0] += amount.to_f
This will generate a navigable structure. (Traditional keying with arrays of [@account, @ticker, @type, @price] as suggested earlier may be better for this particular application.) XKeys auto-vivifies on write rather than read, so querying the structure about elements that don't exist won't change the structure.
