Flatten a nested json object - ruby

I'm looking for a method that will flatten a "json" hash into a flattened hash but keep the path information in the flattened keys.
For example:
h = {"a" => "foo", "b" => [{"c" => "bar", "d" => ["baz"]}]}
flatten(h) should return:
{"a" => "foo", "b_0_c" => "bar", "b_0_d_0" => "baz"}

This should solve your problem:
h = {'a' => 'foo', 'b' => [{'c' => 'bar', 'd' => ['baz']}]}
module Enumerable
def flatten_with_path(parent_prefix = nil)
res = {}
self.each_with_index do |elem, i|
if elem.is_a?(Array)
k, v = elem
else
k, v = i, elem
end
key = parent_prefix ? "#{parent_prefix}.#{k}" : k # assign key name for result hash
if v.is_a? Enumerable
res.merge!(v.flatten_with_path(key)) # recursive call to flatten child elements
else
res[key] = v
end
end
res
end
end
puts h.flatten_with_path.inspect

I'm having a similar question and raised it here
Best way to produce a flattened JSON (denormalize) out of hierarchical JSON in Ruby with a possible solution
Is my solution an optimal one or is there any better way?

Related

how can I programmatically identify which keys have sub key-value-pairs in a JSON doc? [duplicate]

This question already has answers here:
Flattening nested hash to a single hash with Ruby/Rails
(6 answers)
Closed 8 years ago.
I fetch a JSON document and need to programmatically "flatten" the keys for another third-party service.
What this means is, if my JSON doc comes back with the following:
{'first_name' => "Joe", 'hoffman' => {'patterns' => ['negativity', 'self-sabotage'], 'right_road' => 'happy family'}, 'mbti' => 'INTJ'}
I need to be able to know to create a "flat" key-value pair for a third-party service like this:
first_name = "Joe"
hoffman.patterns = "negativity, self-sabotage"
hoffman.right_road = "happy family"
mbti = "INTJ"
Once I know there's a sub-document, the parsing I think I have figured out just appending the sub-keys with key + '.' + "{subkey}" but right now, don't know which ones are straight key-value and which one's have sub-documents.
Question:
a) How can I parse the JSON to know which keys have sub-documents (additional key-values)?
b) Suggestions on ways to create a string from an array
You could also monkey patch Hash to do this on it's own like so:
class Hash
def flatten_keys(prefix=nil)
each_pair.map do |k,v|
key = [prefix,k].compact.join(".")
v.is_a?(Hash) ? v.flatten_keys(key) : [key,v.is_a?(Array) ? v.join(", ") : v]
end.flatten.each_slice(2).to_a
end
def to_flat_hash
Hash[flatten_keys]
end
end
Then it would be
require 'json'
h = JSON.parse(YOUR_JSON_RESPONSE)
#=> {'first_name' => "Joe", 'hoffman' => {'patterns' => ['negativity', 'self-sabotage'], 'right_road' => 'happy family'}, 'mbti' => 'INTJ'}
h.to_flat_hash
#=> {"first_name"=>"Joe", "hoffman.patterns"=>"negativity, self-sabotage", "hoffman.right_road"=>"happy family", "mbti"=>"INTJ"}
Will work with additional nesting too
h = {"first_name"=>"Joe", "hoffman"=>{"patterns"=>["negativity", "self-sabotage"], "right_road"=>"happy family", "wrong_road"=>{"bad_choices"=>["alcohol", "heroin"]}}, "mbti"=>"INTJ"}
h.to_flat_hash
#=> {"first_name"=>"Joe", "hoffman.patterns"=>"negativity, self-sabotage", "hoffman.right_road"=>"happy family", "hoffman.wrong_road.bad_choices"=>"alcohol, heroin", "mbti"=>"INTJ"}
Quick and dirty recursive proc:
# assuming you've already `JSON.parse` the incoming json into this hash:
a = {'first_name' => "Joe", 'hoffman' => {'patterns' => ['negativity', 'self-sabotage'], 'right_road' => 'happy family'}, 'mbti' => 'INTJ'}
# define a recursive proc:
flatten_keys = -> (h, prefix = "") do
#flattened_keys ||= {}
h.each do |key, value|
# Here we check if there's "sub documents" by asking if the value is a Hash
# we also pass in the name of the current prefix and key and append a . to it
if value.is_a? Hash
flatten_keys.call value, "#{prefix}#{key}."
else
# if not we concatenate the key and the prefix and add it to the #flattened_keys hash
#flattened_keys["#{prefix}#{key}"] = value
end
end
#flattened_keys
end
flattened = flatten_keys.call a
# => "first_name"=>"Joe", "hoffman.patterns"=>["negativity", "self-sabotage"], "hoffman.right_road"=>"happy family", "mbti"=>"INTJ"}
And then, to turn the arrays into strings just join them:
flattened.inject({}) do |hash, (key, value)|
value = value.join(', ') if value.is_a? Array
hash.merge! key => value
end
# => {"first_name"=>"Joe", "hoffman.patterns"=>"negativity, self-sabotage", "hoffman.right_road"=>"happy family", "mbti"=>"INTJ"}
Another way, inspired by this post:
def flat_hash(h,f=[],g={})
return g.update({ f=>h }) unless h.is_a? Hash
h.each { |k,r| flat_hash(r,f+[k],g) }
g
end
h = { :a => { :b => { :c => 1,
:d => 2 },
:e => 3 },
:f => 4 }
result = {}
flat_hash(h) #=> {[:a, :b, :c]=>1, [:a, :b, :d]=>2, [:a, :e]=>3, [:f]=>4}
.each{ |k, v| result[k.join('.')] = v } #=> {"a.b.c"=>1, "a.b.d"=>2, "a.e"=>3, "f"=>4}

Is there a better solution to partition a hash into two hashes?

I wrote a method to split a hash into two hashes based on a criteria (a particular hash value). My question is different from another question on Hash. Here is an example of what I expect:
h={
:a => "FOO",
:b => "FOO",
:c => "BAR",
:d => "BAR",
:e => "FOO"
}
h_foo, h_bar = partition(h)
I need h_foo and h_bar to be like:
h_foo={
:a => "FOO",
:b => "FOO",
:e => "FOO"
}
h_bar={
:c => "BAR",
:d => "BAR"
}
My solution is:
def partition h
h.group_by{|k,v| v=="FOO"}.values.collect{|ary| Hash[*ary.flatten]}
end
Is there a clever solution?
There's Enumerable#partition:
h.partition { |k, v| v == "FOO" }.map(&:to_h)
#=> [{:a=>"FOO", :b=>"FOO", :e=>"FOO"}, {:c=>"BAR", :d=>"BAR"}]
Or you could use Enumerable#each_with_object to avoid the intermediate arrays:
h.each_with_object([{}, {}]) { |(k, v), (h_foo, h_bar)|
v == "FOO" ? h_foo[k] = v : h_bar[k] = v
}
#=> [{:a=>"FOO", :b=>"FOO", :e=>"FOO"}, {:c=>"BAR", :d=>"BAR"}]
I don't think there is a clever one liner, but you can make it slightly more generic by doing something like:
def transpose(h,k,v)
h[v] ||= []
h[v] << k
end
def partition(h)
n = {}
h.map{|k,v| transpose(n,k,v)}
result = n.map{|k,v| Hash[v.map{|e| [e, k]}] }
end
which will yield
[{:a=>"FOO", :b=>"FOO", :e=>"FOO"}, {:c=>"BAR", :d=>"BAR"}]
when run against your initial hash h
Edit - TIL about partition. Wicked.
Why not use builtin partition, which is doing almost exactly what you are looking for?
h_foo, h_bar = h.partition { |key, value| value == 'FOO' }
The only downside is that you will get arrays instead of hashes (but you already know how to convert that). In ruby 2.1+ you could simply call .map(&:to_h) at the end of call chain.

Ruby / Remove everything after a matched key / array of hashes

Let's say I have the following array of hashes:
h = [{"name" => "bob"}, {"car" => "toyota"}, {"age" => "25"}]
And I have the following key to match:
k = 'car'
How do I match the 'k' to 'h' and have delete every element after the match so that it returns:
h = [{"name" => "bob"}, {"car" => "toyota"}]
Just convert hash to array, do your task and then convert back
h = {"name" => "bob", "car" => "toyota", "age" => "25"}
array = h.to_a.flatten
index = array.index('car') + 1
h = Hash[*array[0..index]]
=> {"name"=>"bob", "car"=>"toyota"}
By the way, the hash is ordered only since Ruby 1.9
ar = [{"name" => "bob"}, {"car" => "toyota"}, {"age" => "25"}]
p ar[0 .. ar.index{|h| h.key?('car')}] #=>[{"name"=>"bob"}, {"car"=>"toyota"}]
I like megas' version, as its short and to the point. Another approach, which would be more explicit, would be iterating over the keys array of each hash. The keys of a hash are maintained in an ordered array (http://ruby-doc.org/core-1.9.3/Hash.html). They are ordered by when they were first entered. As a result, you can try the following:
newArray = Array.new
h.each do |hash| # Iterate through your array of hashes
newArray << hash
if hash.has_key?("car") # check if this hash is the "car" hash.
break # exits the block
end
end
This all depends, of course, on whether the array was created in the proper order. If it was, then you're golden.
A hash is unordered set by definition, so what you request is somewhat undefined. However you can do something like a hack:
h = {"name" => "bob", "car" => "toyota", "age" => "25"}
matched = false
key_given = "car"
h.each do |k,v|
if matched
h.delete(k)
end
if k == key_given
matched = true
next
end
end
I'm pretty late to the party here. I was looking for a solution to this same problem, but I didn't love these answers. So, here's my approach:
class Array
def take_until(&blk)
i = find_index &blk
take(i + 1)
end
end
h = [{"name" => "bob"}, {"car" => "toyota"}, {"age" => "25"}]
k = 'car'
h.take_until { |x| x.has_key?(k) }
=> [{"name"=>"bob"}, {"car"=>"toyota"}]

How do I convert a Ruby hash so that all of its keys are symbols?

I have a Ruby hash which looks like:
{ "id" => "123", "name" => "test" }
I would like to convert it to:
{ :id => "123", :name => "test" }
hash = {"apple" => "banana", "coconut" => "domino"}
Hash[hash.map{ |k, v| [k.to_sym, v] }]
#=> {:apple=>"banana", :coconut=>"domino"}
#mu is too short: Didn't see word "recursive", but if you insist (along with protection against non-existent to_sym, just want to remind that in Ruby 1.8 1.to_sym == nil, so playing with some key types can be misleading):
hash = {"a" => {"b" => "c"}, "d" => "e", Object.new => "g"}
s2s =
lambda do |h|
Hash === h ?
Hash[
h.map do |k, v|
[k.respond_to?(:to_sym) ? k.to_sym : k, s2s[v]]
end
] : h
end
s2s[hash] #=> {:d=>"e", #<Object:0x100396ee8>=>"g", :a=>{:b=>"c"}}
If you happen to be in Rails then you'll have symbolize_keys:
Return a new hash with all keys converted to symbols, as long as they respond to to_sym.
and symbolize_keys! which does the same but operates in-place. So, if you're in Rails, you could:
hash.symbolize_keys!
If you want to recursively symbolize inner hashes then I think you'd have to do it yourself but with something like this:
def symbolize_keys_deep!(h)
h.keys.each do |k|
ks = k.to_sym
h[ks] = h.delete k
symbolize_keys_deep! h[ks] if h[ks].kind_of? Hash
end
end
You might want to play with the kind_of? Hash to match your specific circumstances; using respond_to? :keys might make more sense. And if you want to allow for keys that don't understand to_sym, then:
def symbolize_keys_deep!(h)
h.keys.each do |k|
ks = k.respond_to?(:to_sym) ? k.to_sym : k
h[ks] = h.delete k # Preserve order even when k == ks
symbolize_keys_deep! h[ks] if h[ks].kind_of? Hash
end
end
Note that h[ks] = h.delete k doesn't change the content of the Hash when k == ks but it will preserve the order when you're using Ruby 1.9+. You could also use the [(key.to_sym rescue key) || key] approach that Rails uses in their symbolize_keys! but I think that's an abuse of the exception handling system.
The second symbolize_keys_deep! turns this:
{ 'a' => 'b', 'c' => { 'd' => { 'e' => 'f' }, 'g' => 'h' }, ['i'] => 'j' }
into this:
{ :a => 'b', :c => { :d => { :e => 'f' }, :g => 'h' }, ['i'] => 'j' }
You could monkey patch either version of symbolize_keys_deep! into Hash if you really wanted to but I generally stay away from monkey patching unless I have very good reasons to do it.
If you are using Rails >= 4 you can use:
hash.deep_symbolize_keys
hash.deep_symbolize_keys!
or
hash.deep_stringify_keys
hash.deep_stringify_keys!
see http://apidock.com/rails/v4.2.1/Hash/deep_symbolize_keys
Just in case you are parsing JSON, from the JSON docs you can add the option to symbolize the keys upon parsing:
hash = JSON.parse(json_data, symbolize_names: true)
Victor Moroz provided a lovely answer for the simple recursive case, but it won't process hashes that are nested within nested arrays:
hash = { "a" => [{ "b" => "c" }] }
s2s[hash] #=> {:a=>[{"b"=>"c"}]}
If you need to support hashes within arrays within hashes, you'll want something more like this:
def recursive_symbolize_keys(h)
case h
when Hash
Hash[
h.map do |k, v|
[ k.respond_to?(:to_sym) ? k.to_sym : k, recursive_symbolize_keys(v) ]
end
]
when Enumerable
h.map { |v| recursive_symbolize_keys(v) }
else
h
end
end
Try this:
hash = {"apple" => "banana", "coconut" => "domino"}
# => {"apple"=>"banana", "coconut"=>"domino"}
hash.tap do |h|
h.keys.each { |k| h[k.to_sym] = h.delete(k) }
end
# => {:apple=>"banana", :coconut=>"domino"}
This iterates over the keys, and for each one, it deletes the stringified key and assigns its value to the symbolized key.
If you're using Rails (or just Active Support):
{ "id" => "123", "name" => "test" }.symbolize_keys
Starting with Ruby 2.5 you can use the transform_key method.
So in your case it would be:
h = { "id" => "123", "name" => "test" }
h.transform_keys!(&:to_sym) #=> {:id=>"123", :name=>"test"}
Note: the same methods are also available on Ruby on Rails.
Here's a Ruby one-liner that is faster than the chosen answer:
hash = {"apple" => "banana", "coconut" => "domino"}
#=> {"apple"=>"banana", "coconut"=>"domino"}
hash.inject({}){|h,(k,v)| h[k.intern] = v; h}
#=> {:apple=>"banana", :coconut=>"domino"}
Benchmark results:
n = 100000
Benchmark.bm do |bm|
bm.report { n.times { hash.inject({}){|h,(k,v)| h[k.intern] = v; h} } }
bm.report { n.times { Hash[hash.map{ |k, v| [k.to_sym, v] }] } }
end
# => user system total real
# => 0.100000 0.000000 0.100000 ( 0.107940)
# => 0.120000 0.010000 0.130000 ( 0.137966)
I'm partial to:
irb
ruby-1.9.2-p290 :001 > hash = {"apple" => "banana", "coconut" => "domino"}
{
"apple" => "banana",
"coconut" => "domino"
}
ruby-1.9.2-p290 :002 > hash.inject({}){ |h, (n,v)| h[n.to_sym] = v; h }
{
:apple => "banana",
:coconut => "domino"
}
This works because we're iterating over the hash and building a new one on the fly. It isn't recursive, but you could figure that out from looking at some of the other answers.
hash.inject({}){ |h, (n,v)| h[n.to_sym] = v; h }
You can also extend core Hash ruby class placing a /lib/hash.rb file :
class Hash
def symbolize_keys_deep!
new_hash = {}
keys.each do |k|
ks = k.respond_to?(:to_sym) ? k.to_sym : k
if values_at(k).first.kind_of? Hash or values_at(k).first.kind_of? Array
new_hash[ks] = values_at(k).first.send(:symbolize_keys_deep!)
else
new_hash[ks] = values_at(k).first
end
end
new_hash
end
end
If you want to make sure keys of any hash wrapped into arrays inside your parent hash are symbolized, you need to extend also array class creating a "array.rb" file with that code :
class Array
def symbolize_keys_deep!
new_ar = []
self.each do |value|
new_value = value
if value.is_a? Hash or value.is_a? Array
new_value = value.symbolize_keys_deep!
end
new_ar << new_value
end
new_ar
end
end
This allows to call "symbolize_keys_deep!" on any hash variable like this :
myhash.symbolize_keys_deep!
def symbolize_keys(hash)
new={}
hash.map do |key,value|
if value.is_a?(Hash)
value = symbolize_keys(value)
end
new[key.to_sym]=value
end
return new
end
puts symbolize_keys("c"=>{"a"=>2,"k"=>{"e"=>9}})
#{:c=>{:a=>2, :k=>{:e=>9}}}
Here's my two cents,
my version of symbolize_keys_deep! uses the original symbolize_keys! provided by rails and just makes a simple recursive call to Symbolize sub hashes.
def symbolize_keys_deep!(h)
h.symbolize_keys!
h.each do |k, v|
symbolize_keys_deep!(v) if v.is_a? Hash
end
end
Facets' Hash#rekey is also a worth mentioning.
Sample:
require 'facets/hash/rekey'
{ "id" => "123", "name" => "test" }.deep_rekey
=> {:id=>"123", :name=>"test"}
There is also a recursive version:
require 'facets/hash/deep_rekey'
{ "id" => "123", "name" => {"first" => "John", "last" => "Doe" } }.deep_rekey
=> {:id=>"123", :name=>{:first=>"John", :last=>"Doe"}}
Here's a little recursive function to do a deep symbolization of the keys:
def symbolize_keys(hash)
Hash[hash.map{|k,v| v.is_a?(Hash) ? [k.to_sym, symbolize_keys(v)] : [k.to_sym, v] }]
end

Recursive DFS Ruby method

I have a YAML file of groups that I would like to get into a MongoDB collection called groups with documents like {"name" => "golf", "parent" => "sports"} (Top level groups, like sports, would just be {"name" => "sports"} without a parent.)
We are trying to traverse the nested hash, but I'm not sure if it's working correctly. I'd prefer to use a recursive method than a lambda proc. What should we change to make it work?
Thanks!
Matt
Here's the working code:
require 'mongo'
require 'yaml'
conn = Mongo::Connection.new
db = conn.db("acani")
interests = db.collection("interests")
##interest_id = 0
interests_hash = YAML::load_file('interests.yml')
def interests.insert_interest(interest, parent=nil)
interest_id = ##interest_id.to_s(36)
if interest.is_a? String # base case
insert({:_id => interest_id, :n => interest, :p => parent})
##interest_id += 1
else # it's a hash
interest = interest.first # get key-value pair in hash
interest_name = interest[0]
insert({:_id => interest_id, :n => interest_name, :p => parent})
##interest_id += 1
interest[1].each do |i|
insert_interest(i, interest_name)
end
end
end
interests.insert_interest interests_hash
View the Interests YAML.
View the acani source.
Your question is just how to convert this code:
insert_enumerable = lambda {|obj, collection|
# obj = {:value => obj} if !obj.kind_of? Enumerable
if(obj.kind_of? Array or obj.kind_of? Hash)
obj.each do |k, v|
v = (v.nil?) ? k : v
insert_enumerable.call({:value => v, :parent => obj}, collection)
end
else
obj = {:value => obj}
end
# collection.insert({name => obj[:value], :parent => obj[:parent]})
pp({name => obj[:value], :parent => obj[:parent]})
}
...to use a method rather than a lambda? If so, then:
def insert_enumerable( obj, collection )
# obj = {:value => obj} if !obj.kind_of? Enumerable
if(obj.kind_of? Array or obj.kind_of? Hash)
obj.each do |k, v|
v = (v.nil?) ? k : v
insert_enumerable({:value => v, :parent => obj}, collection)
end
else
obj = {:value => obj}
end
# collection.insert({name => obj[:value], :parent => obj[:parent]})
pp({name => obj[:value], :parent => obj[:parent]})
end
If that's not what you're asking, please help clarify.

Resources