Recursive DFS Ruby method - ruby

I have a YAML file of groups that I would like to get into a MongoDB collection called groups with documents like {"name" => "golf", "parent" => "sports"} (Top level groups, like sports, would just be {"name" => "sports"} without a parent.)
I am trying to traverse the nested hash, but I'm not sure it's working correctly. I'd prefer to use a recursive method rather than a lambda proc. What should I change to make it work?
Thanks!
Matt

Here's the working code:
require 'mongo'
require 'yaml'
conn = Mongo::Connection.new
db = conn.db("acani")
interests = db.collection("interests")
@@interest_id = 0
interests_hash = YAML::load_file('interests.yml')
def interests.insert_interest(interest, parent=nil)
  interest_id = @@interest_id.to_s(36)
  if interest.is_a? String # base case
    insert({:_id => interest_id, :n => interest, :p => parent})
    @@interest_id += 1
  else # it's a hash
    interest = interest.first # get key-value pair in hash
    interest_name = interest[0]
    insert({:_id => interest_id, :n => interest_name, :p => parent})
    @@interest_id += 1
    interest[1].each do |i|
      insert_interest(i, interest_name)
    end
  end
end
interests.insert_interest interests_hash
View the Interests YAML.
View the acani source.

Your question is just how to convert this code:
insert_enumerable = lambda {|obj, collection|
  # obj = {:value => obj} if !obj.kind_of? Enumerable
  if(obj.kind_of? Array or obj.kind_of? Hash)
    obj.each do |k, v|
      v = (v.nil?) ? k : v
      insert_enumerable.call({:value => v, :parent => obj}, collection)
    end
  else
    obj = {:value => obj}
  end
  # collection.insert({:name => obj[:value], :parent => obj[:parent]})
  pp({:name => obj[:value], :parent => obj[:parent]})
}
...to use a method rather than a lambda? If so, then:
def insert_enumerable( obj, collection )
  # obj = {:value => obj} if !obj.kind_of? Enumerable
  if(obj.kind_of? Array or obj.kind_of? Hash)
    obj.each do |k, v|
      v = (v.nil?) ? k : v
      insert_enumerable({:value => v, :parent => obj}, collection)
    end
  else
    obj = {:value => obj}
  end
  # collection.insert({:name => obj[:value], :parent => obj[:parent]})
  pp({:name => obj[:value], :parent => obj[:parent]})
end
If that's not what you're asking, please help clarify.
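For the traversal itself, here is a minimal sketch of a recursive depth-first method that produces the {"name" => ..., "parent" => ...} documents described in the question. It leaves MongoDB out entirely and only builds an array of hashes. The GROUPS constant, its layout (hashes mapping a group to an array of children), and the method names collect_groups and group_doc are assumptions made for illustration, not taken from the asker's YAML file.

```ruby
# Hypothetical stand-in for YAML.load_file on the groups file; the layout is assumed.
GROUPS = {
  "sports" => ["golf", { "racquet" => ["tennis", "squash"] }],
  "music"  => nil
}

# Builds one document; top-level groups get no "parent" key.
def group_doc(name, parent)
  doc = { "name" => name.to_s }
  doc["parent"] = parent.to_s if parent
  doc
end

# Depth-first walk: record a node first, then recurse into its children.
def collect_groups(node, parent = nil, docs = [])
  case node
  when Hash
    node.each do |name, children|
      docs << group_doc(name, parent)
      collect_groups(children, name, docs) unless children.nil?
    end
  when Array
    node.each { |child| collect_groups(child, parent, docs) }
  else
    # A plain string (or other scalar) is a leaf group.
    docs << group_doc(node, parent)
  end
  docs
end

collect_groups(GROUPS).each { |doc| p doc }
```

Each resulting hash could then be passed to the collection's insert call instead of being printed.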

Related

Access to merged cells using Ruby-Roo

According to the example below, the value is stored only in A1; the other cells return nil.
How is it possible to get A1's value from the other merged cells, or simply to check the range of the A1 cell?
Here is my take: if all merged fields are the same as in the previous row, then the non-merged fields should become an array.
xlsx = Roo::Excelx.new(__dir__ + "/output.xlsx", { expand_merged_ranges: true })
parsed = xlsx.sheet(0).parse(headers: true).drop(1)
parsed_merged = []
.tap do |parsed_merged|
parsed.each do |x|
if parsed_merged.empty?
parsed_merged << {
"field_non_merged1" => x["field_non_merged1"],
"field_merged1" => [x["field_merged1"]],
"field_merged2" => [x["field_merged2"]],
"field_merged3" => [x["field_merged3"]],
"field_merged4" => [x["field_merged4"]],
"field_non_merged2" => x["field_non_merged2"],
"field_non_merged3" => x["field_non_merged3"],
}
else
field_non_merged1_is_same_as_prev = x["field_non_merged1"] == parsed_merged.last["field_non_merged1"]
field_non_merged2_is_same_as_prev = x["field_non_merged2"] == parsed_merged.last["field_non_merged2"]
field_non_merged3_is_same_as_prev = x["field_non_merged3"] == parsed_merged.last["field_non_merged3"]
merged_rows_are_all_same_as_prev = field_non_merged1_is_same_as_prev && field_non_merged2_is_same_as_prev && field_non_merged3_is_same_as_prev
if merged_rows_are_all_same_as_prev
parsed_merged.last["field_merged1"].push x["field_merged1"]
parsed_merged.last["field_merged2"].push x["field_merged2"]
parsed_merged.last["field_merged3"].push x["field_merged3"]
parsed_merged.last["field_merged4"].push x["field_merged4"]
else
parsed_merged << {
"field_non_merged1" => x["field_non_merged1"],
"field_merged1" => [x["field_merged1"]],
"field_merged2" => [x["field_merged2"]],
"field_merged3" => [x["field_merged3"]],
"field_merged4" => [x["field_merged4"]],
"field_non_merged2" => x["field_non_merged2"],
"field_non_merged3" => x["field_non_merged3"],
}
end
end
end
end
.map do |x|
{
"field_non_merged1" => x["field_non_merged1"],
"field_merged1" => x["field_merged1"].compact.uniq,
"field_merged2" => x["field_merged2"].compact.uniq,
"field_merged3" => x["field_merged3"].compact.uniq,
"field_merged4" => x["field_merged4"].compact.uniq,
"field_non_merged2" => x["field_non_merged2"],
"field_non_merged3" => x["field_non_merged3"],
}
end
This is not possible without first assigning the value to all the cells of the range. The same is true even in Excel VBA.
See this sample
require 'axlsx'
p = Axlsx::Package.new
wb = p.workbook
wb.add_worksheet(:name => "Basic Worksheet") do |sheet|
sheet.add_row ["Val", nil]
sheet.add_row [nil, nil]
merged = sheet.merge_cells('A1:B2')
p sheet.rows[0].cells[0].value # "Val"
p sheet.rows[0].cells[1].value # nil
sheet[*merged].each{|cell|cell.value = sheet[*merged].first.value}
p sheet.rows[0].cells[0].value # "Val"
p sheet.rows[0].cells[1].value # "Val"
end
p.serialize('./simple.xlsx')
Please add a sample yourself next time so that we see which gem you used, which code, error etc.
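The idea of assigning the value to every cell of the range can also be sketched in plain Ruby, independent of any spreadsheet gem. The example below assumes the sheet has already been read into an array of rows and that the merged range is known as zero-based row and column ranges; fill_merged_range! is a hypothetical helper name, not part of Roo or axlsx.

```ruby
# Copies the top-left value of a merged range into every cell of that range.
# `grid` is a plain array of rows; `rows` and `cols` are zero-based index ranges.
def fill_merged_range!(grid, rows, cols)
  value = grid[rows.first][cols.first]
  rows.each do |r|
    cols.each { |c| grid[r][c] = value }
  end
  grid
end

# Hypothetical sheet data: A1:B2 is merged, so only A1 carries the value.
grid = [
  ["Val", nil],
  [nil,   nil],
  ["x",   "y"]
]

fill_merged_range!(grid, 0..1, 0..1)
p grid
```

After the call, every cell inside A1:B2 holds "Val", while the third row is left untouched.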

how can I programmatically identify which keys have sub key-value-pairs in a JSON doc? [duplicate]

I fetch a JSON document and need to programmatically "flatten" the keys for another third-party service.
What this means is, if my JSON doc comes back with the following:
{'first_name' => "Joe", 'hoffman' => {'patterns' => ['negativity', 'self-sabotage'], 'right_road' => 'happy family'}, 'mbti' => 'INTJ'}
I need to be able to know to create a "flat" key-value pair for a third-party service like this:
first_name = "Joe"
hoffman.patterns = "negativity, self-sabotage"
hoffman.right_road = "happy family"
mbti = "INTJ"
Once I know there's a sub-document, I think I have the parsing figured out: just prefix each sub-key as key + '.' + subkey. Right now, though, I don't know which keys are straight key-value pairs and which ones have sub-documents.
Question:
a) How can I parse the JSON to know which keys have sub-documents (additional key-values)?
b) Suggestions on ways to create a string from an array
You could also monkey patch Hash to do this on its own like so:
class Hash
def flatten_keys(prefix=nil)
each_pair.map do |k,v|
key = [prefix,k].compact.join(".")
v.is_a?(Hash) ? v.flatten_keys(key) : [key,v.is_a?(Array) ? v.join(", ") : v]
end.flatten.each_slice(2).to_a
end
def to_flat_hash
Hash[flatten_keys]
end
end
Then it would be
require 'json'
h = JSON.parse(YOUR_JSON_RESPONSE)
#=> {'first_name' => "Joe", 'hoffman' => {'patterns' => ['negativity', 'self-sabotage'], 'right_road' => 'happy family'}, 'mbti' => 'INTJ'}
h.to_flat_hash
#=> {"first_name"=>"Joe", "hoffman.patterns"=>"negativity, self-sabotage", "hoffman.right_road"=>"happy family", "mbti"=>"INTJ"}
Will work with additional nesting too
h = {"first_name"=>"Joe", "hoffman"=>{"patterns"=>["negativity", "self-sabotage"], "right_road"=>"happy family", "wrong_road"=>{"bad_choices"=>["alcohol", "heroin"]}}, "mbti"=>"INTJ"}
h.to_flat_hash
#=> {"first_name"=>"Joe", "hoffman.patterns"=>"negativity, self-sabotage", "hoffman.right_road"=>"happy family", "hoffman.wrong_road.bad_choices"=>"alcohol, heroin", "mbti"=>"INTJ"}
Quick and dirty recursive proc:
# assuming you've already `JSON.parse` the incoming json into this hash:
a = {'first_name' => "Joe", 'hoffman' => {'patterns' => ['negativity', 'self-sabotage'], 'right_road' => 'happy family'}, 'mbti' => 'INTJ'}
# define a recursive proc:
flatten_keys = -> (h, prefix = "") do
  @flattened_keys ||= {}
  h.each do |key, value|
    # Here we check if there's "sub documents" by asking if the value is a Hash
    # we also pass in the name of the current prefix and key and append a . to it
    if value.is_a? Hash
      flatten_keys.call value, "#{prefix}#{key}."
    else
      # if not we concatenate the key and the prefix and add it to the @flattened_keys hash
      @flattened_keys["#{prefix}#{key}"] = value
    end
  end
  @flattened_keys
end
flattened = flatten_keys.call a
# => {"first_name"=>"Joe", "hoffman.patterns"=>["negativity", "self-sabotage"], "hoffman.right_road"=>"happy family", "mbti"=>"INTJ"}
And then, to turn the arrays into strings just join them:
flattened.inject({}) do |hash, (key, value)|
value = value.join(', ') if value.is_a? Array
hash.merge! key => value
end
# => {"first_name"=>"Joe", "hoffman.patterns"=>"negativity, self-sabotage", "hoffman.right_road"=>"happy family", "mbti"=>"INTJ"}
Another way, inspired by this post:
def flat_hash(h,f=[],g={})
return g.update({ f=>h }) unless h.is_a? Hash
h.each { |k,r| flat_hash(r,f+[k],g) }
g
end
h = { :a => { :b => { :c => 1,
:d => 2 },
:e => 3 },
:f => 4 }
result = {}
flat_hash(h) #=> {[:a, :b, :c]=>1, [:a, :b, :d]=>2, [:a, :e]=>3, [:f]=>4}
.each{ |k, v| result[k.join('.')] = v } #=> {"a.b.c"=>1, "a.b.d"=>2, "a.e"=>3, "f"=>4}
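Part (a) of the question, knowing which keys have sub-documents, comes down to checking whether a value is a Hash. A minimal sketch, assuming the response has already been parsed with JSON.parse; partition_keys is a hypothetical helper name and the JSON string is made up to match the example in the question.

```ruby
require 'json'

# Hypothetical response body matching the example in the question.
body = '{"first_name":"Joe","hoffman":{"patterns":["negativity","self-sabotage"],' \
       '"right_road":"happy family"},"mbti":"INTJ"}'
doc = JSON.parse(body)

# Splits the top-level keys into those whose values are sub-documents
# (hashes) and those holding plain values or arrays.
def partition_keys(hash)
  nested, flat = hash.partition { |_key, value| value.is_a?(Hash) }
  { nested: nested.map(&:first), flat: flat.map(&:first) }
end

p partition_keys(doc)
```

For the sample document only "hoffman" lands in the nested list; "first_name" and "mbti" are plain key-value pairs.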

Flatten a nested json object

I'm looking for a method that will flatten a "json" hash into a flattened hash but keep the path information in the flattened keys.
For example:
h = {"a" => "foo", "b" => [{"c" => "bar", "d" => ["baz"]}]}
flatten(h) should return:
{"a" => "foo", "b_0_c" => "bar", "b_0_d_0" => "baz"}
This should solve your problem:
h = {'a' => 'foo', 'b' => [{'c' => 'bar', 'd' => ['baz']}]}
module Enumerable
def flatten_with_path(parent_prefix = nil)
res = {}
self.each_with_index do |elem, i|
if elem.is_a?(Array)
k, v = elem
else
k, v = i, elem
end
key = parent_prefix ? "#{parent_prefix}_#{k}" : k # join path segments with "_" to match the requested keys (e.g. "b_0_c")
if v.is_a? Enumerable
res.merge!(v.flatten_with_path(key)) # recursive call to flatten child elements
else
res[key] = v
end
end
res
end
end
puts h.flatten_with_path.inspect
I have a similar question and raised it here, together with a possible solution:
Best way to produce a flattened JSON (denormalize) out of hierarchical JSON in Ruby
Is my solution optimal, or is there a better way?
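The same flattening can also be written as a standalone recursive method, which avoids reopening Enumerable. This is only a sketch under the assumption that the input consists of hashes, arrays, and scalar values, as parsed JSON does; flatten_with_path and path_key are illustrative names, and the "_" separator follows the output requested in the question.

```ruby
# Joins a path segment onto the prefix built so far.
def path_key(prefix, part)
  prefix ? "#{prefix}_#{part}" : part.to_s
end

# Walks hashes by key and arrays by index; any other value is a leaf
# stored under the accumulated path.
def flatten_with_path(obj, prefix = nil, out = {})
  case obj
  when Hash
    obj.each { |k, v| flatten_with_path(v, path_key(prefix, k), out) }
  when Array
    obj.each_with_index { |v, i| flatten_with_path(v, path_key(prefix, i), out) }
  else
    out[prefix] = obj
  end
  out
end

h = { "a" => "foo", "b" => [{ "c" => "bar", "d" => ["baz"] }] }
p flatten_with_path(h)
```

Note that empty hashes and empty arrays produce no keys at all in this version, since only leaf values are recorded.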

How do I replace all the values in a hash with a new value?

Let's say I have an arbitrarily deep nested Hash h:
h = {
:foo => { :bar => 1 },
:baz => 10,
:quux => { :swozz => {:muux => 1000}, :grimel => 200 }
# ...
}
And let's say I have a class C defined as:
class C
attr_accessor :dict
end
How do I replace all nested values in h so that they are now C instances with the dict attribute set to that value? For instance, in the above example, I'd expect to have something like:
h = {
:foo => <C @dict={:bar => 1}>,
:baz => 10,
:quux => <C @dict={:swozz => <C @dict={:muux => 1000}>, :grimel => 200}>
# ...
}
where <C @dict = ...> represents a C instance with @dict = .... (Note that as soon as you reach a value which isn't nested, you stop wrapping it in C instances.)
def convert_hash(h)
h.keys.each do |k|
if h[k].is_a? Hash
c = C.new
c.dict = convert_hash(h[k])
h[k] = c
end
end
h
end
If we override inspect in C to give a more friendly output like so:
def inspect
  "<C @dict=#{dict.inspect}>"
end
and then run with your example h this gives:
puts convert_hash(h).inspect
{:baz=>10, :quux=><C @dict={:grimel=>200,
:swozz=><C @dict={:muux=>1000}>}>, :foo=><C @dict={:bar=>1}>}
Also, if you add an initialize method to C for setting dict:
def initialize(d=nil)
self.dict = d
end
then you can reduce the 3 lines in the middle of convert_hash to just h[k] = C.new(convert_hash(h[k]))
class C
attr_accessor :dict
def initialize(dict)
self.dict = dict
end
end
class Object
def convert_to_dict
C.new(self)
end
end
class Hash
def convert_to_dict
Hash[map {|k, v| [k, v.convert_to_dict] }]
end
end
p h.convert_to_dict
# => {
# => :foo => {
# => :bar => #<C:0x13adc18 @dict=1>
# => },
# => :baz => #<C:0x13adba0 @dict=10>,
# => :quux => {
# => :swozz => {
# => :muux => #<C:0x13adac8 @dict=1000>
# => },
# => :grimel => #<C:0x13ada50 @dict=200>
# => }
# => }
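If mutating the original hash is undesirable, the wrapping described in the question (nested hashes become C instances, plain values stay as they are) can be sketched as a method that returns a new hash. The name wrap_nested is an assumption for illustration; C is defined here with an optional constructor argument so that the example is self-contained.

```ruby
class C
  attr_accessor :dict

  def initialize(dict = nil)
    @dict = dict
  end
end

# Returns a new hash in which every nested hash is wrapped in a C instance.
# Non-hash values are copied over unchanged, and the input is not modified.
def wrap_nested(hash)
  hash.each_with_object({}) do |(key, value), out|
    out[key] = value.is_a?(Hash) ? C.new(wrap_nested(value)) : value
  end
end

h = {
  :foo  => { :bar => 1 },
  :baz  => 10,
  :quux => { :swozz => { :muux => 1000 }, :grimel => 200 }
}

wrapped = wrap_nested(h)
p wrapped[:foo].dict
p wrapped[:quux].dict[:swozz].dict
```

Because each level builds a fresh hash, h itself still holds plain nested hashes afterwards.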

Comparing lists of field-hashes with equivalent AR-objects

I have a list of hashes, as such:
incoming_links = [
{:title => 'blah1', :url => "http://blah.com/post/1"},
{:title => 'blah2', :url => "http://blah.com/post/2"},
{:title => 'blah3', :url => "http://blah.com/post/3"}]
And an ActiveRecord model which has fields in the database with some matching rows, say:
Link.all =>
[<Link#2 @title='blah2' @url='...post/2'>,
<Link#3 @title='blah3' @url='...post/3'>,
<Link#4 @title='blah4' @url='...post/4'>]
I'd like to do set operations on Link.all with incoming_links so that I can figure out that <Link#4 ...> is not in the set of incoming_links, and {:title => 'blah1', :url =>'http://blah.com/post/1'} is not in the Link.all set, like so:
#pseudocode
#incoming_links = as above
links = Link.all
expired_links = links - incoming_links
missing_links = incoming_links - links
expired_links.destroy
missing_links.each{|link| Link.create(link)}
Crappy solution a):
I'd rather not rewrite Array#- and such, and I'm okay with converting incoming_links to a set of unsaved Link objects; so I've tried overriding hash, eql? and so on in Link so that it ignores the id equality that AR::Base provides by default. But this is the only place in the application where this sort of equality should be considered; everywhere else the default Link#id identity is required. Is there some way I could subclass Link and override hash, eql?, etc. there?
Crappy solution b):
The other route I've tried is to pull out the attributes hash for each Link and do a .slice('id',...etc) to prune the hashes down. But this requires writing separate - methods to keep track of the Link objects while doing set operations on the hashes, and writing separate Proxy classes to wrap the incoming_links hashes and Links, which seems a bit overkill. Nonetheless, this is my current solution.
Can you think of a better way to design this interaction? Extra credit for cleanliness.
Try this:
incoming_links = [
{:title => 'blah1', :url => "http://blah.com/post/1"},
{:title => 'blah2', :url => "http://blah.com/post/2"},
{:title => 'blah3', :url => "http://blah.com/post/3"}]
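# attributes returns string keys, so convert them to symbols to match incoming_links
ar_links = Link.all(:select => 'title, url').map { |link| link.attributes.symbolize_keys.slice(:title, :url) }
# which incoming links are not in ar_links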
incoming_links - ar_links
# and vice versa
ar_links - incoming_links
Update:
For your Link model:
def self.not_in_array(array)
keys = array.first.keys
all.reject do |item|
hash = {}
keys.each { |k| hash[k] = item.send(k) }
array.include? hash
end
end
def self.not_in_class(array)
keys = array.first.keys
class_array = []
all.each do |item|
hash = {}
keys.each { |k| hash[k] = item.send(k) }
class_array << hash
end
array - class_array
end
ar = [{:title => 'blah1', :url => 'http://blah.com/ddd'}]
Link.not_in_array ar
#=> all links from the Link model which are not in `ar`
Link.not_in_class ar
#=> all links from `ar` which are not in your Link model
If you override the equality method, will ActiveRecord still complain?
Can't you do something similar to this (as in a regular ruby class):
class Link
attr_reader :title, :url
def initialize(title, url)
@title = title
@url = url
end
def eql?(another_link)
self.title == another_link.title and self.url == another_link.url
end
def hash
title.hash * url.hash
end
end
aa = [Link.new('a', 'url1'), Link.new('b', 'url2')]
bb = [Link.new('a', 'url1'), Link.new('d', 'url4')]
(aa - bb).each{|x| puts x.title}
The requirements are:
# Keep track of original link objects when
# comparing against a set of incomplete `attributes` hashes.
# Don't alter the `hash` and `eql?` methods of Link permanently,
# or globally, throughout the application.
The current solution is in effect using Hash's eql? method, and annotating the hashes with the original objects:
class LinkComp < Hash
LINK_COLS = [:title, :url]
attr_accessor :link
def self.[](args)
if args.first.is_a?(Link) #not necessary for the algorithm,
#but nice for finding typos and logic errors
links = args.collect do |lnk|
lk = super(lnk.attributes.slice(*LINK_COLS.collect(&:to_s)))
lk.link = lnk
lk
end
elsif args.blank?
[]
#else #raise error for finding typos
end
end
end
incoming_links = [
{:title => 'blah1', :url => "http://blah.com/post/1"},
{:title => 'blah2', :url => "http://blah.com/post/2"},
{:title => 'blah3', :url => "http://blah.com/post/3"}]
#Link.all =>
#[<Link#2 @title='blah2' @url='...post/2'>,
# <Link#3 @title='blah3' @url='...post/3'>,
# <Link#4 @title='blah4' @url='...post/4'>]
incoming_links= LinkComp[incoming_links.collect{|i| Link.new(i)}]
links = LinkComp[Link.all] #As per fl00r's suggestion
#this could be :select'd down somewhat, w.l.o.g.
missing_links = (incoming_links - links).collect(&:link)
expired_links = (links - incoming_links).collect(&:link)
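The two requirements (keep the original objects, leave Link's hash and eql? alone) can also be met by comparing a projection of each item while filtering the original lists. The sketch below is plain Ruby: FakeLink is a Struct standing in for the ActiveRecord model, and compare_key and diff_links are hypothetical helper names, not an existing API.

```ruby
require 'set'

# Stand-in for the ActiveRecord model; only the attribute readers matter here.
FakeLink = Struct.new(:id, :title, :url)

COMPARE_KEYS = [:title, :url].freeze

# Projects a hash or a record onto the attributes used for comparison.
def compare_key(item)
  COMPARE_KEYS.map { |k| item.is_a?(Hash) ? item[k] : item.public_send(k) }
end

# Returns [expired, missing]: records absent from the incoming hashes,
# and incoming hashes absent from the records. Originals are preserved.
def diff_links(records, incoming)
  record_keys   = records.map { |r| compare_key(r) }.to_set
  incoming_keys = incoming.map { |h| compare_key(h) }.to_set
  expired = records.reject  { |r| incoming_keys.include?(compare_key(r)) }
  missing = incoming.reject { |h| record_keys.include?(compare_key(h)) }
  [expired, missing]
end

records = [
  FakeLink.new(2, 'blah2', 'http://blah.com/post/2'),
  FakeLink.new(3, 'blah3', 'http://blah.com/post/3'),
  FakeLink.new(4, 'blah4', 'http://blah.com/post/4')
]
incoming = [
  { :title => 'blah1', :url => 'http://blah.com/post/1' },
  { :title => 'blah2', :url => 'http://blah.com/post/2' },
  { :title => 'blah3', :url => 'http://blah.com/post/3' }
]

expired, missing = diff_links(records, incoming)
p expired.map(&:id)
p missing
```

With real models, expired would hold the Link records to destroy and missing the attribute hashes to pass to Link.create.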

Resources