What is this Hash-like/Tree-like Construct Called? - ruby

I want to create a "Config" class that acts somewhere between a hash and a tree. It's just for storing global values, which can have a context.
Here's how I use it:
Config.get("root.parent.child_b") #=> "value"
Here's what the class might look like:
class Construct
def get(path)
# split path by "."
# search tree for nodes
end
def set(key, value)
# split path by "."
# create tree node if necessary
# set tree value
end
def tree
{
:root => {
:parent => {
:child_a => "value",
:child_b => "another value"
},
:another_parent => {
:something => {
:nesting => "goes on and on"
}
}
}
}
end
end
Is there a name for this kind of thing, somewhere between Hash and Tree (not a Computer Science major)? Basically a hash-like interface to a tree.
Something that outputs like this:
t = TreeHash.new
t.set("root.parent.child_a", "value")
t.set("root.parent.child_b", "another value")
desired output format:
t.get("root.parent.child_a") #=> "value"
t.get("root") #=> {"parent" => {"child_a" => "value", "child_b" => "another value"}}
instead of this:
t.get("root") #=> nil
or this (which you get the value from by calling {}.value)
t.get("root") #=> {"parent" => {"child_a" => {}, "child_b" => {}}}

You can implement one in no-time:
class TreeHash < Hash
attr_accessor :value
def initialize
block = Proc.new {|h,k| h[k] = TreeHash.new(&block)}
super &block
end
def get(path)
find_node(path).value
end
def set(path, value)
find_node(path).value = value
end
private
def find_node(path)
path.split('.').inject(self){|h,k| h[k]}
end
end
You could improve the implementation by making the unneeded Hash methods private, but it already works the way you wanted. The data is stored in a hash, so you can easily convert it to YAML.
EDIT:
To meet the further expectations (and to make to_yaml work properly by default), use this modified version:
class TreeHash < Hash
def initialize
block = Proc.new {|h,k| h[k] = TreeHash.new(&block)}
super &block
end
def get(path)
path.split('.').inject(self){|h,k| h[k]}
end
def set(path, value)
path = path.split('.')
leaf = path.pop
path.inject(self){|h,k| h[k]}[leaf] = value
end
end
This version is a slight trade-off, as you cannot store values in non-leaf nodes.
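A quick check of the modified version, as a sketch. One thing worth knowing: because of the default block, get on a path that was never set creates empty nodes along the way. The peek method below is a hypothetical helper (not part of the answer) that reads without creating anything, using key? and fetch, which do not trigger the default block.

```ruby
class TreeHash < Hash
  def initialize
    # Every missing key is filled with a fresh TreeHash (autovivification).
    super { |h, k| h[k] = TreeHash.new }
  end

  def get(path)
    path.split('.').inject(self) { |h, k| h[k] }
  end

  def set(path, value)
    *parents, leaf = path.split('.')
    parents.inject(self) { |h, k| h[k] }[leaf] = value
  end

  # Hypothetical helper: read-only lookup that never creates nodes.
  def peek(path)
    path.split('.').inject(self) do |node, key|
      return nil unless node.is_a?(Hash) && node.key?(key)
      node.fetch(key)
    end
  end
end

t = TreeHash.new
t.set("root.parent.child_a", "value")
t.set("root.parent.child_b", "another value")

t.get("root.parent.child_a")  # => "value"
t.get("root")                 # => {"parent"=>{"child_a"=>"value", "child_b"=>"another value"}}
t.peek("root.missing.leaf")   # => nil, and "missing" is not added under "root"
```

If the side effect of get does not matter for your config, the answer's version is enough on its own.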

I think the name for the structure is really a nested hash, and the code in the question is a reinvention of javascript's dictionaries. Since a dictionary in JS (or Python or ...) can be nested, each value can be another dictionary, which has its own key/val pairs. In javascript, that's all an object is.
And the best bit is being able to use JSON to define it neatly, and pass it around:
tree : {
'root' : {
'parent' : {
'child_a' : "value",
'child_b' : "another value"
},
'another_parent' : {
'something' : {
'nesting' : "goes on and on"
}
}
}
};
In JS you can then do tree.root.parent.child_a.
This answer to another question suggests using the Hashie gem to convert JSON objects into Ruby objects.

I think this resembles a TreeMap data structure similar to the one in Java described here. It does the same thing (key/value mappings) but retrieval might be different since you are using the nodes themselves as the keys. Retrieval from the TreeMap described is abstracted from the implementation since, when you pass in a key, you don't know the exact location of it in the tree.
Hope that makes sense!

Er... it can certainly be done using a hierarchical hash table, but why do you need the hierarchy? If you only need exactly-matching get and put, why can't you just make a single hash table that happens to use a dot-separated naming convention?
That's all that's needed to implement the functionality you've asked for, and it's obviously very simple...
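A minimal sketch of that idea, assuming a hypothetical FlatConfig class (the name and the subtree helper are made up for illustration). Exact-match get and set are a single hash lookup; the cost is that asking for an inner node such as "root" needs a scan over the keys.

```ruby
# Hypothetical FlatConfig: one Hash, with the full dotted path as the key.
class FlatConfig
  def initialize
    @values = {}
  end

  def set(path, value)
    @values[path] = value
  end

  def get(path)
    @values[path]
  end

  # Everything stored below a prefix; this has to look at every key.
  def subtree(prefix)
    @values.select { |key, _| key.start_with?("#{prefix}.") }
  end
end

config = FlatConfig.new
config.set("root.parent.child_a", "value")
config.set("root.parent.child_b", "another value")

config.get("root.parent.child_b")    # => "another value"
config.get("root")                   # => nil, there is no entry for the inner node
config.subtree("root.parent").keys   # => ["root.parent.child_a", "root.parent.child_b"]
```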

Why use a hash-like interface at all? Why not use chaining of methods to navigate your tree? For example config.root.parent.child_b and use instance methods and if needed method_missing() to implement them?
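Something along these lines, as a rough sketch. ConfigNode is a hypothetical wrapper name, and it assumes the tree is a nested Hash with symbol keys. Keys that collide with existing Object methods (for example :class or :hash) never reach method_missing, so they would need a different accessor.

```ruby
# Hypothetical ConfigNode: wraps a nested Hash and exposes keys as methods.
class ConfigNode
  def initialize(data)
    @data = data
  end

  def method_missing(name, *args)
    # Only handle zero-argument calls for keys that actually exist.
    return super unless args.empty? && @data.key?(name)
    value = @data[name]
    value.is_a?(Hash) ? ConfigNode.new(value) : value
  end

  def respond_to_missing?(name, include_private = false)
    @data.key?(name) || super
  end
end

config = ConfigNode.new(
  root: { parent: { child_a: "value", child_b: "another value" } }
)

config.root.parent.child_b   # => "another value"
config.respond_to?(:root)    # => true
```

An unknown key still raises NoMethodError, which tends to catch typos earlier than a hash lookup returning nil.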

How to optimize extracting data from nested hashes in ruby?

Background
I have a collection of nested hashes which present a set of parameters to define application behavior:
custom_demo_options: {
verticals: {
fashion: true,
automotive: false,
fsi: false
},
channels: {
b2b: true,
b2c: true
}
}
website_data: {
verticals: {
fashion: {
b2b: {
code: 'luma_b2b',
url: 'b2b.luma.com'
},
b2c: {
code: 'base',
url: 'luma.com'
}
}
}
}
The choices made in the custom_demo_options hash relate to data stored in the website_data hash and serve to return values from it:
data = []
collection = {}
custom_demo_options[:verticlas].each do |vertical_name, vertical_choice|
# Get each vertical selection
if vertical_choice == true
# Loop through the channels for each selected vertical
custom_demo_options[:channels].each do |channel_name, channel_choice|
# Get each channel selection for each vertical selection
if channel_choice == true
# Loop through the website data for each vertical/channel selection
website_data[:verticals].each do |site_vertical, vertical_data|
# Look at the keys of the [:website_data][:verticals] hash
# If we have a vertical selection that matches a website_data vertical...
if site_vertical == vertical_name
# For each website_data vertical collection...
vertical_data.each do |vertical_channel, channel_value|
# If we have a matching channel in the collection...
if vertical_channel == channel_name
# Add the channel's url and code to the collection hash
collection[:url] = channel_value[:url]
collection[:code] = channel_value[:code]
# Push the collection hash(es) onto the data array
data.push(collection)
end
end
end
end
end
end
end
end
The data pushed to the data array is ultimately used to create the following nginx map definition:
map $http_host $MAGE_RUN_CODE {
luma.com base;
b2b.luma.com luma_b2b;
}
As an example of the relationship between the hashes, if a user sets custom_demo_options[:channels][:b2b] to false, the b2b code/url pair stored in the website_data hash would be removed from the nginx block:
map $http_host $MAGE_RUN_CODE {
luma.com base;
}
Question
The above code works, but I know it's horribly inefficient. I'm relatively new to ruby, but I think this is most likely a logical challenge rather than a language-specific one.
My question is, what is the proper way to connect these hashes rather than using loops as I've done? I've done some reading on hash.select and it seems like this might be the best route, but I'd like to know: are there other approaches I should consider that would optimize this operation?
UPDATE
I've been able to implement the first suggestion (thanks again to the poster); however, I think the second solution will be a better approach. Everything works as described; however, my data structure has changed slightly, and although I understand what the solution is doing, I'm having trouble adapting accordingly. Here's the new structure:
custom_demo_options = {
verticals: {
fashion: true,
automotive: false,
fsi: false
},
channels: {
b2b: true,
b2c: true
},
geos: [
'us_en'
]
}
website_data = {
verticals: {
fashion: {
us_en: {
b2b: {
code: 'luma_b2b',
url: 'b2b.luma.com'
},
b2c: {
code: 'base',
url: 'luma.com'
}
}
}
}
}
So, I've added another level to the hashes: the geo.
I've tried to adapt the second solution as follows:
class CustomOptionsMap
  attr_accessor :custom_options, :website_data
  def initialize(custom_options, website_data)
    @custom_options = custom_options
    @website_data = website_data[:verticals]
  end
  def data
    verticals = selected_verticals
    channels = selected_channels
    geos = selected_geos
    # I know this is the piece I'm not understanding. How to map channels and geos accordingly.
    verticals.map{ |vertical| @website_data.fetch(vertical).slice(*channels) }
  end
  private
  def selected_geos
    @custom_options[:geos].select{|_,v| v } # I think this is correct, as it extracts the geo from the array and we don't have additional keys
  end
  def selected_verticals
    @custom_options[:verticals].select{|_,v| v }.keys
  end
  def selected_channels
    @custom_options[:channels].select{|_,v| v }.keys
  end
end
demo_configuration = CustomOptionsMap.new(custom_demo_options, website_data)
print demo_configuration.data
Any guidance on what I'm missing regarding the map statement would be very much appreciated.
Object-oriented approach.
Using OOP might be more readable and consistent in this context, as Ruby is an object-oriented language.
By introducing a simple Ruby class and using the activesupport gem, which extends Hash with some useful methods, the same result can be achieved in the following way:
class WebsiteConifg
  attr_accessor :custom_options, :website_data
  def initialize(custom_options, website_data)
    @custom_options = custom_options
    @website_data = website_data[:verticals]
  end
  def data
    verticals = selected_verticals
    channels = selected_channels
    verticals.map{ |vertical| @website_data.fetch(vertical).slice(*channels) }
  end
  private
  def selected_verticals
    @custom_options[:verticals].select{|_,v| v }.keys
  end
  def selected_channels
    @custom_options[:channels].select{|_,v| v }.keys
  end
end
Based on the passed custom_demo_options, we select only those verticals and channels whose values are set to true.
For your configuration this returns:
selected_verticals # [:fashion]
selected_channels # [:b2b, :b2c]
+data()
The simple public interface iterates through all selected verticals based on the passed options and returns an Array of hashes for the given channels by using slice(keys).
fetch(key)
returns the value for the given key. It is equivalent to h[:key], except that it raises KeyError when the key is missing instead of returning nil.
h = {a: 2, b: 3}
h.fetch(:a) # 2
h.fetch(:b) # 3
slice(key1, key2) requires activesupport on Ruby versions before 2.5; from Ruby 2.5 on, Hash#slice is built in
returns a hash containing only the keys passed as arguments. The method accepts multiple arguments; since in our example we have an Array of keys, we use the * splat operator to comply with this interface.
h = {a: 2, b: 3}
h.slice(:a) # {:a=>2}
h.slice(:a, :b) # {:a=>2, :b=>3}
h.slice(*[:a, :b]) # {:a=>2, :b=>3}
Usage
website_config = WebsiteConifg.new(custom_demo_options, website_data)
website_config.data
# returns
# [{:b2b=>{:code=>"luma_b2b", :url=>"b2b.luma.com"}, :b2c=>{:code=>"base", :url=>"luma.com"}}]
UPDATE
Changed relevant parts:
def data
verticals = selected_verticals
channels = selected_channels
geos = selected_geos
verticals.map do |vertical|
verticals_data = @website_data.fetch(vertical)
# in case of multiple geolocations
# collecting relevant entries of all of them
geos_data = geos.map{|geo| verticals_data.fetch(geo) }
# for each geo-location getting selected channels
geos_data.map {|geo_data| geo_data.slice(*channels) }
end.flatten
end
private
# as the `website_data' hash uses symbols, we need to convert string -> sym
def selected_geos
@custom_options[:geos].map(&:to_sym)
end
def selected_verticals
selected_for(:verticals).keys
end
def selected_channels
selected_for(:channels).keys
end
def selected_for(key)
@custom_options[key].select{|_,v| v }
end
The easiest way to understand what kind of output (data) you have at each step of the each/map iterators is to place a debugger there, such as pry or byebug.
Say you have key = :foo and hsh = { foo: 1, bar: 2 } and you want to know the hash's value for that key.
The approach you're using here is essentially
result = nil
hsh.each { |k,v| result = v if k == :foo }
But why do that when you can simply say
result = hsh[:foo]
It seems like you understand how hashes can be iterable structures, and you can traverse them like arrays. But you're overdoing it, and forgetting that hashes are indexed structures. In terms of your code I would refactor it like so:
# fixed typo here: verticlas => verticals
custom_demo_options[:verticals].each do |vertical_name, vertical_choice|
# == true is almost always unnecessary, just use a truthiness check
next unless vertical_choice
custom_demo_options[:channels].each do |channel_name, channel_choice|
next unless channel_choice
vertical_data = website_data[:verticals][vertical_name]
channel_value = vertical_data[channel_name]
# This must be initialized here:
collection = {}
collection[:url] = channel_value[:url]
collection[:code] = channel_value[:code]
data.push(collection)
end
end
You can see that a lot of the nesting and complexity is removed. Note that I am initializing collection at the time it has attributes added to it. This is a little too much to go into here, but I highly advise reading up on mutability in Ruby. Your current code will probably not do what you expect it to, because you're pushing the same collection hash into the array multiple times.
At this point, you could refactor it into a more functional-programming style, with some chained methods, but I'd leave that exercise for you.
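For reference, one possible shape of that functional-style refactor, as a sketch rather than the definitive version. The helper names selected_keys and site_entries are made up for illustration, it uses the original two-level structure from the question, and it returns the stored channel hashes themselves instead of building new collection hashes.

```ruby
custom_demo_options = {
  verticals: { fashion: true, automotive: false, fsi: false },
  channels:  { b2b: true, b2c: true }
}

website_data = {
  verticals: {
    fashion: {
      b2b: { code: 'luma_b2b', url: 'b2b.luma.com' },
      b2c: { code: 'base', url: 'luma.com' }
    }
  }
}

# Keys whose value is truthy.
def selected_keys(options)
  options.select { |_, chosen| chosen }.keys
end

# One entry per selected vertical/channel pair that exists in website_data.
def site_entries(options, website_data)
  channels = selected_keys(options[:channels])
  selected_keys(options[:verticals]).flat_map do |vertical|
    site = website_data[:verticals].fetch(vertical, {})
    site.values_at(*channels).compact
  end
end

data = site_entries(custom_demo_options, website_data)
# data holds the b2b entry followed by the b2c entry
```

Using fetch with a default of {} means a selected vertical that has no website data is skipped instead of raising.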

Does Ruby have a `Pair` data type?

Sometimes I need to deal with key / value data.
I dislike using Arrays, because they are not constrained in size (it's too easy to accidentally add more than 2 items, plus you end up needing to validate size later on). Furthermore, indexes of 0 and 1 become magic numbers and do a poor job of conveying meaning ("When I say 0, I really mean head...").
Hashes are also not appropriate, as it is possible to accidentally add an extra entry.
I wrote the following class to solve the problem:
class Pair
attr_accessor :head, :tail
def initialize(h, t)
@head, @tail = h, t
end
end
It works great and solves the problem, but I am curious to know: does the Ruby standard library come with such a class already?
No, Ruby doesn't have a standard Pair class.
You could take a look at "Using Tuples in Ruby?".
The solutions involve using a class similar to yours, the Tuples gem, or OpenStruct.
Python has tuple, but even Java doesn't have one: "A Java collection of value pairs? (tuples?)".
You can also use the OpenStruct data type. It's probably not exactly what you wanted, but here is an implementation ...
require 'ostruct'
foo = OpenStruct.new
foo.head = "cabeza"
foo.tail = "cola"
Finally,
puts foo.head
=> "cabeza"
puts foo.tail
=> "cola"
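Note that OpenStruct accepts any attribute name, so it does not enforce the two-field constraint from the question. A sketch of an alternative using the core Struct class, which fixes the members up front; the names Pair, head and tail are taken from the question, and this is one option rather than a dedicated pair type.

```ruby
# Struct generates a class with exactly these two members.
Pair = Struct.new(:head, :tail)

pair = Pair.new("cabeza", "cola")
pair.head            # => "cabeza"
pair.tail            # => "cola"
pair.to_a            # => ["cabeza", "cola"]

head, tail = *pair   # destructuring works through to_a

begin
  Pair.new(1, 2, 3)  # a third value is rejected
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```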
No, there is no such class in the Ruby core library or standard libraries. It would be nice to have core library support (as well as literal syntax) for tuples, though.
I once experimented with a class very similar to yours, in order to replace the array that gets yielded by Hash#each with a pair. I found that monkey-patching Hash#each to return a pair instead of an array actually breaks surprisingly little code, provided that the pair class responds appropriately to to_a and to_ary:
class Pair
attr_reader :first, :second
def to_ary; [first, second] end
alias_method :to_a, :to_ary
private
attr_writer :first, :second
def initialize(first, second)
self.first, self.second = first, second
end
class << self; alias_method :[], :new end
end
You don't need a special type, you can use a 2-element array with a little helper to give the pairs a consistent order. E.g.:
def pair(a, b)
(a.hash < b.hash) ? [a, b] : [b, a]
end
distances = {
pair("Los Angeles", "New York") => 2_789.6,
pair("Los Angeles", "Sydney") => 7_497,
}
distances[ pair("Los Angeles", "New York") ] # => 2789.6
distances[ pair("New York", "Los Angeles") ] # => 2789.6
distances[ pair("Sydney", "Los Angeles") ] # => 7497

How do I access the elements in a hash which is itself a value in a hash?

I have this hash $chicken_parts, which consists of symbol/hash pairs (many more than shown here):
$chicken_parts = { :beak => {"name"=>"Beak", "color"=>"Yellowish orange", "function"=>"Pecking"}, :claws => {"name"=>"Claws", "color"=>"Dirty", "function"=>"Scratching"} }
Then I have a class Embryo which has two class-specific hashes:
class Embryo
@parts_grown = Hash.new
@currently_developing = Hash.new
Over time, new pairs from $chicken_parts will be .merge!ed into @parts_grown. At various times, @currently_developing will be declared equal to one of the symbol/hash pairs from @parts_grown.
I'm creating Embryo class functions and I want to be able to access the "name", "color", and "function" values in @currently_developing, but I don't seem to be able to do it.
def grow_part(part)
@parts_grown.merge!($chicken_parts[part])
end
def develop_part(part)
@currently_developing = @parts_grown[part]
end
seems to populate the hashes as expected, but
puts @currently_developing["name"]
does not work. Is this whole scheme a bad idea? Should I just make the Embryo hashes into arrays of symbols from $chicken_parts, and refer to it whenever needed? That seemed like cheating to me for some reason...
There's a little bit of confusion here. When you merge! in grow_part, you aren't adding a :beak => {etc...} pair to @parts_grown. Rather, you are taking the hash that the part name points to and adding all of its fields directly to @parts_grown. So after one grow_part, @parts_grown might look like this:
{"name"=>"Beak", "color"=>"Yellowish orange", "function"=>"Pecking"}
I don't think that's what you want. Instead, try this for grow_part:
def grow_part(part)
@parts_grown[part] = $chicken_parts[part]
end
class Embryo
  @parts_grown = {a: 1, b: 2}
  def show
    p @parts_grown
  end
  def self.show
    p @parts_grown
  end
end
embryo = Embryo.new
embryo.show
Embryo.show
--output:--
nil
{:a=>1, :b=>2}
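The output shows that an instance variable assigned in the class body belongs to the class object Embryo, not to its instances, which is why the instance method prints nil. A minimal sketch of one way around it, assuming each embryo should own its hashes: assign them in initialize. The PARTS constant stands in for $chicken_parts and is an assumption made to keep the example self-contained.

```ruby
# Stand-in for $chicken_parts (assumption for this example).
PARTS = {
  beak:  { "name" => "Beak",  "color" => "Yellowish orange", "function" => "Pecking" },
  claws: { "name" => "Claws", "color" => "Dirty",            "function" => "Scratching" }
}

class Embryo
  attr_reader :parts_grown, :currently_developing

  def initialize
    # Per-instance state has to be set up inside an instance method.
    @parts_grown = {}
    @currently_developing = {}
  end

  def grow_part(part)
    @parts_grown[part] = PARTS[part]
  end

  def develop_part(part)
    @currently_developing = @parts_grown[part]
  end
end

embryo = Embryo.new
embryo.grow_part(:beak)
embryo.develop_part(:beak)
embryo.currently_developing["name"]   # => "Beak"
```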

Issues iterating over a hash in Ruby

What I'd like to do is pass in a hash of hashes that looks something like this:
input = {
"configVersion" => "someVers",
"box" =>
{
"primary" => {
"ip" => "192.168.1.1",
"host" => "something"
},
"api" => {
"live" => "livekey",
"test" => "testkey"
}
}
}
then iterate over it, continuing if the value is another hash, and generating output with it. The result should be something like this:
configVersion = "someVers"
box.primary.ip = "192.168.1.1"
box.primary.host = "something"
and so on...
I know how to crawl through and continue if the value is a hash, but I'm unsure how to concatenate the whole thing together and pass the value back up. Here is my code:
def crawl(input)
input.each do |k,v|
case v
when Hash
out << "#{k}."
crawl(v)
else
out << " = '#{v}';"
end
end
end
My problem is: where to define out and how to return it all back. I'm very new to Ruby.
You can pass strings between multiple calls of the recursive method and use them like accumulators.
This method uses an ancestors string to build up your dot-notation string of keys, and an output str that collects the output and returns it at the end of the method. The str is passed through every call; the chain variable is a modified version of the ancestor string that changes from call to call:
def hash_to_string(hash, ancestors = "", str = "")
hash.each do |key, value|
chain = ancestors.empty? ? key : "#{ancestors}.#{key}"
if value.is_a? Hash
hash_to_string(value, chain, str)
else
str << "#{chain} = \"#{value}\"\n"
end
end
str
end
hash_to_string input
(This assumes you want your output to be a string formatted as you've shown above)
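If you would rather not thread an accumulator string through the calls, here is a sketch of a variant that returns an array of lines instead. The name flatten_keys is made up for illustration; it relies on flat_map to merge the results of the recursive calls.

```ruby
# Returns one "path = value" line per leaf, with keys joined by dots.
def flatten_keys(hash, prefix = nil)
  hash.flat_map do |key, value|
    path = prefix ? "#{prefix}.#{key}" : key.to_s
    value.is_a?(Hash) ? flatten_keys(value, path) : ["#{path} = \"#{value}\""]
  end
end

input = {
  "configVersion" => "someVers",
  "box" => {
    "primary" => { "ip" => "192.168.1.1", "host" => "something" },
    "api"     => { "live" => "livekey", "test" => "testkey" }
  }
}

lines = flatten_keys(input)
puts lines
# configVersion = "someVers"
# box.primary.ip = "192.168.1.1"
# box.primary.host = "something"
# box.api.live = "livekey"
# box.api.test = "testkey"
```

Joining the array with "\n" gives the same string as the accumulator version, minus the trailing newline.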
This blog post has a decent solution for the recursion and offers a slightly better alternative using the method_missing method available in Ruby.
In general, your recursion is correct, you just want to be doing something different instead of concatenating the output to out.

Search ruby hash for empty value

I have a ruby hash like this
h = {"a" => "1", "b" => "", "c" => "2"}
Now I have a Ruby function that evaluates this hash and returns true if it finds a key with an empty value. The following function always returns true, even if none of the values in the hash is empty:
def hash_has_blank(hsh)
hsh.each do |k,v|
if v.empty?
return true
end
end
return false
end
What am I doing wrong here?
Try this:
def hash_has_blank hsh
hsh.values.any? &:empty?
end
Or, if you are using an old 1.8.x Ruby:
def hash_has_blank hsh
hsh.values.any?{|i|i.empty?}
end
I hope you're ready to learn some Ruby magic here. I wouldn't define such a function globally like you did. If it's an operation on a hash, then it should be an instance method on the Hash class. You can do it like this:
class Hash
def has_blank?
self.reject{|k,v| !v.nil? && v.length > 0}.size > 0
end
end
reject will return a new hash containing only the nil and empty values, and then the size of this new hash is checked.
A possibly more efficient way (it doesn't have to traverse the whole hash):
class Hash
def has_blank?
self.values.any?{|v| v.nil? || v.length == 0}
end
end
But this will still traverse the whole hash if there is no empty value.
I've changed the empty? to !nil? && length > 0 because I don't know how your empty method works.
If you just want to check if any of the values is an empty string you could do
h.has_value?('')
but your function seems to work fine.
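A quick way to confirm that, as a sketch: the function from the question, run against the sample hash and against a copy without the empty value. If it really returns true for a hash like the second one, the data being passed in is probably not what you think it is.

```ruby
def hash_has_blank(hsh)
  hsh.each do |k, v|
    return true if v.empty?
  end
  false
end

with_blank    = { "a" => "1", "b" => "", "c" => "2" }
without_blank = { "a" => "1", "c" => "2" }

hash_has_blank(with_blank)       # => true
hash_has_blank(without_blank)    # => false

with_blank.has_value?('')        # => true
without_blank.has_value?('')     # => false
```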
I'd consider refactoring your model domain. Obviously the hash represents something tangible. Why not make it an object? If the item can be completely represented by a hash, you may wish to subclass Hash. If it's more complicated, the hash can be an attribute.
Secondly, the reason for which you are checking blanks can be named to better reflect your domain. You haven't told us the "why", but let's assume that your Item is only valid if it doesn't have any blank values.
class MyItem < Hash
def valid?
!invalid?
end
def invalid?
values.any?{|i| i.empty?}
end
end
The point is, if you can establish a vocabulary that makes sense in your domain, your code will be cleaner and more understandable. Using a Hash is just a means to an end and you'd be better off using more descriptive, domain-specific terms.
Using the example above, you'd be able to do:
my_item = MyItem["a" => "1", "b" => "", "c" => "2"]
my_item.valid? #=> false
