I have a hash of hashes that I want to display as a tree, something like routes. Below is an example of the expected result and the result I actually got.
Example hash:
hash = {
'movies' => {
'action' => {
'2007' => ['video1.avi', 'x.wmv'],
'2008' => ['']
},
'comedy' => {
'2007' => [],
'2008' => ['y.avi']
}
},
'audio' => {
'rock' => {
'2003' => [],
'2004' => ['group', 'group1']
}
}
}
I expected this result:
movies
movies\action
movies\action\2007
movies\action\2007\video1.avi
movies\action\2007\x.wmv
movies\action\2008
movies\comedy\2007
movies\comedy\2008
movies\comedy\2008\y.avi
audio
audio\rock\2003
audio\rock\2004
audio\rock\2004\group
audio\rock\2004\group1
Here is the code I wrote:
def meth(key, val)
val.each do |key1, val1|
puts "#{key}/#{key1}"
meth(key1, val1) if val1
end
end
hash.each do |key, val|
puts key
meth(key,val)
end
It returns this result:
movies
movies/action
action/2007
2007/video1.avi
2007/x.wmv
action/2008
2008/
movies/comedy
comedy/2007
comedy/2008
2008/y.avi
audio
audio/rock
rock/2003
rock/2004
2004/group
2004/group1
Can anybody explain how to do this?
UPDATE
Thanks for the answers. I got it working with the code below. The hint was to set key1 to the accumulated path before recursing.
def meth key, val
val.each do |key1, val1|
puts "#{key}/#{key1}"
key1 = "#{key}/#{key1}"
meth(key1, val1) if val1
end
end
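You could change the code to:
def meth(key, val)
  val.each do |key1, val1|
    path = "#{key}/#{key1}"
    puts path
    if val1.is_a?(Hash)
      # Recursive case: descend with the full path built so far.
      meth(path, val1)
    else
      # Base case: val1 is an array of leaf names (possibly empty).
      Array(val1).each { |leaf| puts "#{path}/#{leaf}" unless leaf.to_s.empty? }
    end
  end
end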
You are expecting the method to behave differently depending on where it's called from, but that's not the case. The method does the same thing regardless of where it's called from (e.g. when it calls itself).
Recursion is the act of dividing one problem into smaller subproblems. There will always be at least two. In your case the two subproblems are:
- print two values
- print the key and iterate a hash
At least one of your subproblems needs to end the recursion; otherwise it will run forever. In the above case, the first subproblem ends the recursion.
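As a rough sketch of those two subproblems (the name tree_paths is mine, not from the question, and it assumes the leaves are arrays of names as in the example hash), the array branch is the base case that ends the recursion and the hash branch is the recursive case:

```ruby
# Hypothetical helper: collects every path in a nested hash whose
# leaves are arrays of names, instead of printing while walking.
def tree_paths(node, prefix = nil)
  case node
  when Hash
    # Recursive case: emit the path of each key, then descend into its value.
    node.flat_map do |key, child|
      path = prefix ? "#{prefix}/#{key}" : key.to_s
      [path] + tree_paths(child, path)
    end
  when Array
    # Base case: an array of leaf names ends the recursion.
    # Empty names (like the '' in the question) are skipped.
    node.reject { |name| name.to_s.empty? }.map { |name| "#{prefix}/#{name}" }
  else
    []
  end
end

sample = { 'audio' => { 'rock' => { '2003' => [], '2004' => ['group', 'group1'] } } }
puts tree_paths(sample)
```

Passing the accumulated path down as an argument is what keeps each call independent of where it was called from.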
You have to keep track of the path as an array:
def meth key, val
  val.each do |key1, val1|
    # Build the full path first so the top level has no leading slash.
    path = key + [key1]
    puts path.join("/")
    meth(path, val1) if val1
  end
end
meth [], root_of_hash
When I have a nested structure that can contain different types of classes I like to create a case statement so it is easy to define what will happen in different scenarios.
def print_tree(input, path=[])
case input
when Hash then input.flat_map{|x,y| print_tree(y, path+[x])}
when Array then input.empty? ? [path] : input.map{|x| path+[x]}
end
end
puts print_tree(my_hash).map{|z|z.join('/')}
Related
Apparently, my ability to think functionally has withered over time. I am having trouble selecting a sub-dataset from a dataset. I can solve the problem in a hacky imperative style, but I believe there is a sweet functional solution that I am unfortunately unable to find.
Consider this data structure (I tried not to simplify it beyond usability):
class C
  attr_reader :attrC
  def initialize(base)
    @attrC = { "c1" => base+10 , "c2" => base+20, "c3" => base+30}
  end
end
class B
  attr_reader :attrB
  @@counter = 0
  def initialize
    @attrB = Hash.new
    @attrB["b#{@@counter}"] = C.new(@@counter)
    @@counter += 1
  end
end
class A
  attr_reader :attrA
  def initialize
    @attrA = { "a1" => B.new, "a2" => B.new, "a3" => B.new}
  end
end
which is created as a = A.new. The complete data set then would be
#<A: @attrA={"a1"=>#<B: @attrB={"b0"=>#<C: @attrC={"c1"=>10, "c2"=>20, "c3"=>30}>}>,
"a2"=>#<B: @attrB={"b1"=>#<C: @attrC={"c1"=>11, "c2"=>21, "c3"=>31}>}>,
"a3"=>#<B: @attrB={"b2"=>#<C: @attrC={"c1"=>12, "c2"=>22, "c3"=>32}>}>}>
which is subject to a selection. I want to retrieve only those instances of B where attrB's key is "b2".
My hacky way is:
result = Array.new
A.new.attrA.each do |_,va|
result << va.attrB.select { |kb,_| kb == "b2" }
end
p result.reject { |a| a.empty?} [0]
which results in exactly what I intended:
{"b2"=>#<C: @attrC={"c1"=>12, "c2"=>22, "c3"=>32}>}
but I believe there would be a one-liner using map, fold, zip and reduce.
If you want a one-liner:
a.attrA.values.select { |b| b.attrB.keys == %w(b2) }
This returns instances of B. In your question, you're getting attrB values rather than instances of B. If that's what you want, there's this ugly reduce:
a.attrA.values.reduce([]) { |memo, b| memo << b.attrB if b.attrB.keys == %w(b2) ; memo }
I'm not sure what you're trying to do here, though?
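If the goal is just the first attrB hash that contains a given key, a find-based version avoids building and filtering an intermediate array. This is only a sketch: the Structs below are simplified stand-ins I made up for the question's A and B classes, and first_attr_b_with_key is a hypothetical name.

```ruby
# Stand-ins for the question's classes (assumed, simplified shapes).
A = Struct.new(:attrA)
B = Struct.new(:attrB)

# Returns the first attrB hash that contains `key`, or nil if none does.
def first_attr_b_with_key(a, key)
  a.attrA.values.map(&:attrB).find { |attr_b| attr_b.key?(key) }
end

a = A.new({
  "a1" => B.new({ "b0" => { "c1" => 10 } }),
  "a2" => B.new({ "b1" => { "c1" => 11 } }),
  "a3" => B.new({ "b2" => { "c1" => 12 } })
})

p first_attr_b_with_key(a, "b2")
```

Using key? instead of comparing keys against %w(b2) also matches hashes that hold other keys besides "b2", which may or may not be what you want.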
I want to take a hash with nested hashes and arrays and flatten it out into a single hash with unique values. I keep trying to approach this from different angles, but then I make it way more complex than it needs to be and get myself lost in what's happening.
Example Source Hash:
{
"Name" => "Kim Kones",
"License Number" => "54321",
"Details" => {
"Name" => "Kones, Kim",
"Licenses" => [
{
"License Type" => "PT",
"License Number" => "54321"
},
{
"License Type" => "Temp",
"License Number" => "T123"
},
{
"License Type" => "AP",
"License Number" => "A666",
"Expiration Date" => "12/31/2020"
}
]
}
}
Example Desired Hash:
{
"Name" => "Kim Kones",
"License Number" => "54321",
"Details_Name" => "Kones, Kim",
"Details_Licenses_1_License Type" => "PT",
"Details_Licenses_1_License Number" => "54321",
"Details_Licenses_2_License Type" => "Temp",
"Details_Licenses_2_License Number" => "T123",
"Details_Licenses_3_License Type" => "AP",
"Details_Licenses_3_License Number" => "A666",
"Details_Licenses_3_Expiration Date" => "12/31/2020"
}
For what it's worth, here's my most recent attempt before giving up.
def flattify(hashy)
temp = {}
hashy.each do |key, val|
if val.is_a? String
temp["#{key}"] = val
elsif val.is_a? Hash
temp.merge(rename val, key, "")
elsif val.is_a? Array
temp["#{key}"] = enumerate val, key
else
end
print "=> #{temp}\n"
end
return temp
end
def rename (hashy, str, n)
temp = {}
hashy.each do |key, val|
if val.is_a? String
temp["#{key}#{n}"] = val
elsif val.is_a? Hash
val.each do |k, v|
temp["#{key}_#{k}#{n}"] = v
end
elsif val.is_a? Array
temp["#{key}"] = enumerate val, key
else
end
end
return flattify temp
end
def enumerate (ary, str)
temp = {}
i = 1
ary.each do |x|
temp["#{str}#{i}"] = x
i += 1
end
return flattify temp
end
Interesting question!
Theory
Here's a recursive method to parse your data.
It keeps track of which keys and indices it has found.
It appends them in a tmp array.
Once a leaf object has been found, it gets written in a hash as value, with a joined tmp as key.
This small hash then gets recursively merged back to the main hash.
Code
def recursive_parsing(object, tmp = [])
case object
when Array
object.each.with_index(1).with_object({}) do |(element, i), result|
result.merge! recursive_parsing(element, tmp + [i])
end
when Hash
object.each_with_object({}) do |(key, value), result|
result.merge! recursive_parsing(value, tmp + [key])
end
else
{ tmp.join('_') => object }
end
end
As an example:
require 'pp'
pp recursive_parsing(data)
# {"Name"=>"Kim Kones",
# "License Number"=>"54321",
# "Details_Name"=>"Kones, Kim",
# "Details_Licenses_1_License Type"=>"PT",
# "Details_Licenses_1_License Number"=>"54321",
# "Details_Licenses_2_License Type"=>"Temp",
# "Details_Licenses_2_License Number"=>"T123",
# "Details_Licenses_3_License Type"=>"AP",
# "Details_Licenses_3_License Number"=>"A666",
# "Details_Licenses_3_Expiration Date"=>"12/31/2020"}
Debugging
Here's a modified version with old-school debugging. It might help you understand what's going on:
def recursive_parsing(object, tmp = [], indent="")
puts "#{indent}Parsing #{object.inspect}, with tmp=#{tmp.inspect}"
result = case object
when Array
puts "#{indent} It's an array! Let's parse every element:"
object.each_with_object({}).with_index(1) do |(element, result), i|
result.merge! recursive_parsing(element, tmp + [i], indent + " ")
end
when Hash
puts "#{indent} It's a hash! Let's parse every key,value pair:"
object.each_with_object({}) do |(key, value), result|
result.merge! recursive_parsing(value, tmp + [key], indent + " ")
end
else
puts "#{indent} It's a leaf! Let's return a hash"
{ tmp.join('_') => object }
end
puts "#{indent} Returning #{result.inspect}\n"
result
end
When called with recursive_parsing([{a: 'foo', b: 'bar'}, {c: 'baz'}]), it displays:
Parsing [{:a=>"foo", :b=>"bar"}, {:c=>"baz"}], with tmp=[]
It's an array! Let's parse every element:
Parsing {:a=>"foo", :b=>"bar"}, with tmp=[1]
It's a hash! Let's parse every key,value pair:
Parsing "foo", with tmp=[1, :a]
It's a leaf! Let's return a hash
Returning {"1_a"=>"foo"}
Parsing "bar", with tmp=[1, :b]
It's a leaf! Let's return a hash
Returning {"1_b"=>"bar"}
Returning {"1_a"=>"foo", "1_b"=>"bar"}
Parsing {:c=>"baz"}, with tmp=[2]
It's a hash! Let's parse every key,value pair:
Parsing "baz", with tmp=[2, :c]
It's a leaf! Let's return a hash
Returning {"2_c"=>"baz"}
Returning {"2_c"=>"baz"}
Returning {"1_a"=>"foo", "1_b"=>"bar", "2_c"=>"baz"}
Unlike the others, I have no love for each_with_object :-). But I do like passing a single result hash around so I don't have to merge and remerge hashes over and over again.
def flattify(value, result = {}, path = [])
case value
when Array
value.each.with_index(1) do |v, i|
flattify(v, result, path + [i])
end
when Hash
value.each do |k, v|
flattify(v, result, path + [k])
end
else
result[path.join("_")] = value
end
result
end
(Some details adopted from Eric, see comments)
Non-recursive approach, using BFS with an array as a queue. I keep the key-value pairs where the value isn't an array/hash, and push array/hash contents to the queue (with combined keys). Turning arrays into hashes (["a", "b"] ↦ {1=>"a", 2=>"b"}) as that felt neat.
def flattify(hash)
(q = hash.to_a).select { |key, value|
value = (1..value.size).zip(value).to_h if value.is_a? Array
!value.is_a?(Hash) || !value.each { |k, v| q << ["#{key}_#{k}", v] }
}.to_h
end
One thing I like about it is the nice combination of keys as "#{key}_#{k}". In my other solution, I could've also used a string path = '' and extended that with path + "_" + k, but that would've caused a leading underscore that I'd have to avoid or trim with extra code.
Write a function that accepts a multi-dimensional container of any size and converts it into a one dimensional associative array whose keys are strings representing their value's path in the original container.
So { 'one' => {'two' => 3, 'four' => [ 5,6,7]}, 'eight'=> {'nine'=> {'ten'=>11}}}
would become:
"{'one/two' => 3,'one/four/0' => 5, 'one/four/1' => 6, 'one/four/2' => 7, 'eight/nine/ten' : 11}"
I've gotten this so far... But am having a lot of issues. Any pointers to things I am overlooking?
def oneDimHash(hash)
  if hash.is_a?(Fixnum)
    puts "AHHH"
  else
    hash.each_pair do |key,value|
      if value.is_a?(Hash)
        @temp_key << key << '/'
        oneDimHash(value)
      elsif value.is_a?(Array)
        value.each_with_index do |val,index|
          puts index
          @temp_key << "#{index}"
          oneDimHash(val)
        end
      else
        @temp_key << key
        @result["#{@temp_key}"] = "#{value}"
        @temp_key = ''
      end
    end
  end
end
It's immediately suspect to me that you are using instance variables instead of method arguments / local variables. Very likely that is producing messed-up keys, at least. Supposing that the method signature cannot be modified, you can work around the need for additional arguments by delegating to a helper function. Perhaps I'd try an approach along these lines:
def oneDimHash(o)
oneDimHashInternal("", o, {})
end
def oneDimHashInternal(keyStem, o, hash)
if o.is_a? Hash
o.each_pair do |key, value|
oneDimHashInternal("#{keyStem}/#{key}", value, hash)
end
elsif o.is_a? Array
# Work this out for yourself
else
# Store the (non-container) object in hash
# Work this out for yourself
end
hash
end
Note also that there are Enumerables that are neither Arrays nor Hashes. I don't know whether you need to account for such.
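For reference, here is one possible way to fill in the two branches left open above. It is a sketch, not the only answer: the name one_dim_hash and the nil starting stem are my own choices, and it assumes array indices start at 0, as in the question's expected output.

```ruby
# Flattens nested hashes and arrays into one hash keyed by '/'-joined paths.
# `stem` is the path so far (nil at the top level, so there is no leading '/').
def one_dim_hash(obj, stem = nil, result = {})
  case obj
  when Hash
    obj.each { |key, value| one_dim_hash(value, [stem, key].compact.join('/'), result) }
  when Array
    obj.each_with_index { |value, index| one_dim_hash(value, [stem, index].compact.join('/'), result) }
  else
    # Non-container object: store it under the path accumulated so far.
    result[stem.to_s] = obj
  end
  result
end

p one_dim_hash({ 'one' => { 'two' => 3, 'four' => [5, 6, 7] }, 'eight' => { 'nine' => { 'ten' => 11 } } })
```

Because the path and the result hash travel as arguments, no instance variables are needed and nothing has to be reset between leaves.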
How about this?
def oneDimHash(obj,parent="")
unless obj.is_a?(Hash)
puts "AHHH" # or may be better: raise "AHHH"
else
obj.flat_map do |key,value|
combined_key = parent.empty? ? key.to_s : "#{parent}/#{key}"
case value
when Hash then oneDimHash(value,combined_key).to_a
when Array then value.each_with_index.map { |v,i| [combined_key+"/#{i}",v] }
else [ [combined_key,value] ]
end
end.to_h
end
end
I have a terribly nested JSON response.
[[{:test=>[{:id=>1, :b=>{id: '2'}}]}]]
There are more arrays than that, but you get the idea.
Is there a way to recursively search through and find all the items that have a key I need?
I tried using this function extract_list() but it doesn't handle arrays well.
def nested_find(obj, needed_keys)
return {} unless obj.is_a?(Array) || obj.is_a?(Hash)
obj.inject({}) do |hash, val|
if val.is_a?(Hash) && (tmp = needed_keys & val.keys).length > 0
tmp.each { |key| hash[key] = val[key] }
elsif val.is_a?(Array)
hash.merge!(obj.map { |v| nested_find(v, needed_keys) }.reduce(:merge))
end
hash
end
end
Example
needed_keys = [:id, :another_key]
nested_find([ ['test', [{id:1}], [[another_key: 5]]]], needed_keys)
# {:id=>1, :another_key=>5}
The following is not what I'd suggest, but just to give a brief alternative to the other solutions provided:
2.1.1 :001 > obj = [[{:test=>[{:id=>1, :b=>{id: '2'}}]}]]
=> [[{:test=>[{:id=>1, :b=>{:id=>"2"}}]}]]
2.1.1 :002 > key = :id
=> :id
2.1.1 :003 > obj.inspect.scan(/#{key.inspect}=>([^,}]*)[,}]/).flatten.map {|s| eval s}
=> [1, "2"]
Note: the use of eval here is just for illustration. It fails or produces incorrect results for any value whose inspect output cannot be eval'd back to an equal object, and it can execute malicious code.
You'll need to write your own recursive handler. Assuming that you've already converted your JSON to a Ruby data structure (via JSON.load or whatnot):
def deep_find_value_with_key(data, desired_key)
case data
when Array
data.each do |value|
if found = deep_find_value_with_key value, desired_key
return found
end
end
when Hash
if data.key?(desired_key)
data[desired_key]
else
data.each do |key, val|
if found = deep_find_value_with_key(val, desired_key)
return found
end
end
end
end
return nil
end
The general idea is that given a data structure, you check it for the key (if it's a hash) and return the matching value if found. Otherwise, you iterate it (if it's an Array or Hash) and perform the same check on each of its children.
This will find the value for the first occurrence of the given key, or nil if the key doesn't exist in the tree. If you need to find all instances then it's slightly different - you basically need to pass an array that will accumulate the values:
def deep_find_value_with_key(data, desired_key, hits = [])
case data
when Array
data.each do |value|
deep_find_value_with_key value, desired_key, hits
end
when Hash
if data.key?(desired_key)
hits << data[desired_key]
else
data.each do |key, val|
deep_find_value_with_key(val, desired_key, hits)
end
end
end
return hits
end
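Note that the version above stops descending into a hash once that hash itself contains the key, so for the question's sample data the nested :id under :b would not be collected. If you want those too, a variant along these lines keeps walking after a hit (deep_find_all is a hypothetical name for this sketch):

```ruby
# Collects the values of every occurrence of desired_key, at any depth.
def deep_find_all(data, desired_key, hits = [])
  case data
  when Array
    data.each { |value| deep_find_all(value, desired_key, hits) }
  when Hash
    data.each do |key, value|
      hits << value if key == desired_key
      # Keep descending even after a match, so nested matches are found too.
      deep_find_all(value, desired_key, hits)
    end
  end
  hits
end

p deep_find_all([[{ :test => [{ :id => 1, :b => { id: '2' } }] }]], :id)
```

The same hits array is threaded through every call, so nothing needs to be merged on the way back up.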
I need to create a signature string for a variable in Ruby, where the variable can be a number, a string, a hash, or an array. The hash values and array elements can also be any of these types.
This string will be used to compare the values in a database (Mongo, in this case).
My first thought was to create an MD5 hash of a JSON encoded value, like so: (body is the variable referred to above)
def createsig(body)
Digest::MD5.hexdigest(JSON.generate(body))
end
This nearly works, but JSON.generate does not encode the keys of a hash in the same order each time, so createsig({:a=>'a',:b=>'b'}) does not always equal createsig({:b=>'b',:a=>'a'}).
What is the best way to create a signature string to fit this need?
Note: For the detail oriented among us, I know that you can't JSON.generate() a number or a string. In these cases, I would just call MD5.hexdigest() directly.
I coded up the following pretty quickly and don't have time to really test it here at work, but it ought to do the job. Let me know if you find any issues with it and I'll take a look.
This should properly flatten out and sort the arrays and hashes, and you'd need to have some pretty strange-looking strings for there to be any collisions.
def createsig(body)
Digest::MD5.hexdigest( sigflat body )
end
def sigflat(body)
if body.class == Hash
arr = []
body.each do |key, value|
arr << "#{sigflat key}=>#{sigflat value}"
end
body = arr
end
if body.class == Array
str = ''
body.map! do |value|
sigflat value
end.sort!.each do |value|
str << value
end
end
if body.class != String
body = body.to_s << body.class.to_s
end
body
end
> sigflat({:a => {:b => 'b', :c => 'c'}, :d => 'd'}) == sigflat({:d => 'd', :a => {:c => 'c', :b => 'b'}})
=> true
If you could only get a string representation of body and not have the Ruby 1.8 hash come back with different orders from one time to the other, you could reliably hash that string representation. Let's get our hands dirty with some monkey patches:
require 'digest/md5'
class Object
def md5key
to_s
end
end
class Array
def md5key
map(&:md5key).join
end
end
class Hash
def md5key
sort.map(&:md5key).join
end
end
Now any object (of the types mentioned in the question) responds to md5key by returning a reliable key to use for creating a checksum, so:
def createsig(o)
Digest::MD5.hexdigest(o.md5key)
end
Example:
body = [
{
'bar' => [
345,
"baz",
],
'qux' => 7,
},
"foo",
123,
]
p body.md5key # => "bar345bazqux7foo123"
p createsig(body) # => "3a92036374de88118faf19483fe2572e"
Note: This hash representation does not encode the structure, only the concatenation of the values. Therefore ["a", "b", "c"] will hash the same as ["abc"].
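If the structure needs to be part of the signature, one alternative is to sort hash keys recursively and then serialize with JSON, which keeps brackets, quotes and separators in the output. The sketch below is my own variation, not the code above: canonicalize and stable_sig are hypothetical names, keys are converted to strings (so :a and "a" are treated as the same key), and the value is wrapped in an array so that bare numbers and strings can be serialized on older versions of the json library as well.

```ruby
require 'json'
require 'digest/md5'

# Returns a copy of value with every hash rebuilt in sorted key order.
# Array order is left alone, since it is usually significant.
def canonicalize(value)
  case value
  when Hash
    value.map { |k, v| [k.to_s, canonicalize(v)] }.sort_by(&:first).to_h
  when Array
    value.map { |v| canonicalize(v) }
  else
    value
  end
end

# Wrapping in an array lets scalars be serialized the same way as containers.
def stable_sig(body)
  Digest::MD5.hexdigest(JSON.generate([canonicalize(body)]))
end

puts stable_sig({ :a => 'a', :b => 'b' }) == stable_sig({ :b => 'b', :a => 'a' })
```

With this approach ["a", "b", "c"] and ["abc"] serialize differently, and so do "1" and 1.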
Here's my solution. I walk the data structure and build up a list of pieces that get joined into a single string. In order to ensure that the class types seen affect the hash, I inject a single unicode character that encodes basic type information along the way. (For example, we want ["1", "2", "3"].objsum != [1,2,3].objsum)
I did this as a refinement on Object, but it's easily ported to a monkey patch. To use it, just require the file and run "using ObjSum".
require 'digest/md5'
require 'set'
module ObjSum
refine Object do
def objsum
parts = []
queue = [self]
while queue.size > 0
item = queue.shift
if item.kind_of?(Hash)
parts << "\\000"
item.keys.sort.each do |k|
queue << k
queue << item[k]
end
elsif item.kind_of?(Set)
parts << "\\001"
item.to_a.sort.each { |i| queue << i }
elsif item.kind_of?(Enumerable)
parts << "\\002"
item.each { |i| queue << i }
elsif item.kind_of?(Integer) # Fixnum was removed in newer Ruby versions
parts << "\\003"
parts << item.to_s
elsif item.kind_of?(Float)
parts << "\\004"
parts << item.to_s
else
parts << item.to_s
end
end
Digest::MD5.hexdigest(parts.join)
end
end
end
Just my 2 cents:
module Ext
module Hash
module InstanceMethods
# Return a string suitable for generating content signature.
# Signature image does not depend on order of keys.
#
# {:a => 1, :b => 2}.signature_image == {:b => 2, :a => 1}.signature_image # => true
# {{:a => 1, :b => 2} => 3}.signature_image == {{:b => 2, :a => 1} => 3}.signature_image # => true
# etc.
#
# NOTE: Signature images of identical content generated under different versions of Ruby are NOT GUARANTEED to be identical.
def signature_image
# Store normalized key-value pairs here.
ar = []
each do |k, v|
ar << [
k.is_a?(::Hash) ? k.signature_image : [k.class.to_s, k.inspect].join(":"),
v.is_a?(::Hash) ? v.signature_image : [v.class.to_s, v.inspect].join(":"),
]
end
ar.sort.inspect
end
end
end
end
class Hash #:nodoc:
include Ext::Hash::InstanceMethods
end
These days there is a formally defined method for canonicalizing JSON, for exactly this reason: https://datatracker.ietf.org/doc/html/draft-rundgren-json-canonicalization-scheme-16
There is a ruby implementation here: https://github.com/dryruby/json-canonicalization
Depending on your needs, you could call ary.inspect or ary.to_yaml, even.