How to retain order in XML Array to Hash conversion? - ruby

I'm trying to parse XML in Ruby using Nori, which internally uses Nokogiri.
The XML has some repeated tags, and the library parses repeated tags as Arrays and non-repeated tags as normal elements (Hash). For example, this XML
<nodes>
<foo>
<name>a</name>
</foo>
<bar>
<name>b</name>
</bar>
<baz>
<name>c</name>
</baz>
<foo>
<name>d</name>
</foo>
<bar>
<name>e</name>
</bar>
</nodes>
is parsed as
{nodes: {
foo: [{name: "a"}, {name: "d"}],
bar: [{name: "b"}, {name: "e"}],
baz: {name: "c"}
}}
How do I retain the order of elements in the resulting hash, as in the output below?
{nodes: [
{foo: {name: "a"}},
{bar: {name: "b"}},
{baz: {name: "c"}},
{foo: {name: "d"}},
{bar: {name: "e"}},
]}
(This may be a library-specific question, but the intention is to know whether anyone has faced a similar issue and how to parse it correctly.)

Nori can't do this on its own. What you can do is tune the Nori output like this:
input = {nodes: {
foo: [{name: "a"}, {name: "d"}],
bar: [{name: "b"}, {name: "e"}],
baz: {name: "c"}
}}
def unfurl(hash)
out=[]
hash.each_pair{|k,v|
case v
when Array
v.each{|item|
out << {k => item}
}
else
out << {k => v}
end
}
return out
end
output = {:nodes => unfurl(input[:nodes])}
puts output.inspect
This prints an array in the shape the original question requested, but the elements are grouped by tag name, which is different from the XML order:
{nodes: [
{foo: {name: "a"}},
{foo: {name: "d"}},
{bar: {name: "b"}},
{bar: {name: "e"}},
{baz: {name: "c"}},
]}
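If the original document order matters, one option is to skip the Nori hash and walk the child elements directly. The following is a minimal sketch, not something Nori provides: it swaps in REXML (assumed to be installed, as it is with a standard Ruby) so that it has no gem dependency, and `ordered_children` is a hypothetical helper name. It also assumes each child holds only simple text leaves, as in the question.

```ruby
require 'rexml/document'

xml = <<~XML
  <nodes>
    <foo><name>a</name></foo>
    <bar><name>b</name></bar>
    <baz><name>c</name></baz>
    <foo><name>d</name></foo>
    <bar><name>e</name></bar>
  </nodes>
XML

# Hypothetical helper: builds one single-key hash per child element,
# visiting the children in document order.
def ordered_children(parent)
  parent.elements.map do |child|
    # Assumes every grandchild is a plain text leaf such as <name>a</name>.
    fields = child.elements.each_with_object({}) do |leaf, h|
      h[leaf.name.to_sym] = leaf.text
    end
    { child.name.to_sym => fields }
  end
end

root = REXML::Document.new(xml).root
output = { root.name.to_sym => ordered_children(root) }
p output
```

The same idea should carry over to Nokogiri, which Nori uses internally, by iterating over the root's child elements instead of REXML's.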

Related

adding key values inside of an array of hashes using .each

I'm wondering what the easiest method is for adding up the 'string' values in an array of hashes using an each statement.
lost_boys = [
{name: 'Tootles', age: '11'},
{name: 'Nibs', age: '9'},
{name: 'Slightly', age: '10'},
{name: 'Curly', age: '8'},
{name: 'The Twins', age: '9'}
]
a = []
lost_boys.each do |ages|
a = ages[:age].to_i
puts a
end
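A minimal sketch of one way to do this, assuming "adding" means summing the ages after converting each string to an integer. The variable names `total` and `total_with_sum` are illustrative only.

```ruby
lost_boys = [
  { name: 'Tootles', age: '11' },
  { name: 'Nibs', age: '9' },
  { name: 'Slightly', age: '10' },
  { name: 'Curly', age: '8' },
  { name: 'The Twins', age: '9' }
]

# With each: accumulate into a running total instead of reassigning it.
total = 0
lost_boys.each do |boy|
  total += boy[:age].to_i
end
puts total

# The same result without a manual accumulator (Array#sum, Ruby 2.4+).
total_with_sum = lost_boys.sum { |boy| boy[:age].to_i }
puts total_with_sum
```

The loop in the question reassigns `a` on every pass, so it only ever holds the last age; using `+=` on a number keeps the running sum.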

Find & Replace elements in array of hashes - Ruby [duplicate]

I have an array of hashes as below
status_arr = [{id: 5, status: false},
{id: 7, status: false},
{id: 3, status: false},
{id: 9, status: false} ]
I would like to update the hashes to status: true if their id is 5 or 9:
update_ids = [5, 9]
I am trying the following and have no idea how to proceed:
status_arr.select{ |arr| update_ids.include?(arr[:id]) arr[:status] = true}
Expected output:
status_arr = [{id: 5, status: true},
{id: 7, status: false},
{id: 3, status: false},
{id: 9, status: true} ]
require 'set'
update_ids = Set.new([5,3])
status_arr.map{ |s| s[:status] = update_ids.include?(s[:id]); s }
#=> [{:id=>5, :status=>true}, {:id=>7, :status=>false}, {:id=>3, :status=>true}, {:id=>9, :status=>false}]
Instead of a Set, you can use just a Hash:
update_ids = {5 => true, 3=> true}
status_arr.map{ |s| s[:status] = update_ids.include?(s[:id]); s }
#=> [{:id=>5, :status=>true}, {:id=>7, :status=>false}, {:id=>3, :status=>true}, {:id=>9, :status=>false}]
Or an Array, although Array#include? is a linear scan, so it will be slower when update_ids is large:
update_ids = [5,3]
status_arr.map{ |s| s[:status] = update_ids.include?(s[:id]); s }
#=> [{:id=>5, :status=>true}, {:id=>7, :status=>false}, {:id=>3, :status=>true}, {:id=>9, :status=>false}]
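For reference, here is a self-contained sketch using the ids from the question (5 and 9). It is a variation on the snippets above, not the same code: it uses each rather than map because the hashes are modified in place, and it only sets status to true for matching ids, leaving the other entries untouched.

```ruby
require 'set'

status_arr = [
  { id: 5, status: false },
  { id: 7, status: false },
  { id: 3, status: false },
  { id: 9, status: false }
]
update_ids = Set[5, 9]

# Mutates each matching hash in place; non-matching hashes keep their status.
status_arr.each do |entry|
  entry[:status] = true if update_ids.include?(entry[:id])
end

p status_arr
```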

adding a reduce_with_index iterator to ruby

I am using ActiveRecord. It has a handy method called group_by. When I use it with my ActiveRecord objects, I get the hash below:
{["junior"]=>[#<Lead id: 1, created_at: "2015-02-13 02:34:39", updated_at: "2015-02-13 02:35:27", case_enabled: true>, #<Lead id: 2, created_at: "2015-02-13 20:48:19", updated_at: "2015-02-13 20:48:19", case_enabled: nil>], ["senior"]=>[#<Lead id: 3, created_at: "2015-02-13 20:48:19", updated_at: "2015-02-13 20:48:19", case_enabled: nil>, #<Lead id: 4, created_at: "2015-02-13 20:49:16", updated_at: "2015-02-13 20:49:16", case_enabled: nil>]}
However, I want a hash with subhashes that contain the collection as an ActiveRecord::Relation and column data. So this is what I came up with:
i = 0
r = group.reduce({}) do |acc, (k,v)|
h = {}
active_record_relation = where(id: v.map(&:id))
h["#{k.first}_collection"] = active_record_relation
h["#{k.first}_columns"] = Classification.where(code: k.first).first.default_fields
acc[i] = h
i += 1
acc
end
And it gives me the results I want:
{0=>{"junior_collection"=>#<ActiveRecord::Relation [# ... ]>, "junior_columns"=>[ ... ]}, 1=>{"senior_collection"=>#<ActiveRecord::Relation [# ... ]>, "senior_columns"=>[ ... ]}}
The fact that I had to add the i variable makes me feel like this is not the Ruby way to do this. I looked at the docs but didn't find a way to add an index to reduce, since I am already passing a hash into reduce. Is there another way?
Your way is probably good enough, but you can avoid separately tracking the index by doing .each.with_index.reduce(...) { |acc, ((k,v),i)| ... }, like so:
h = {'a' => 'b', 'c' => 'd', 'e' => 'f'}
h.each.with_index.reduce('OK') do |acc, ((k, v), i)|
puts "acc=#{acc}, k=#{k}, v=#{v}, i=#{i}"
acc
end
# acc=OK, k=a, v=b, i=0
# acc=OK, k=c, v=d, i=1
# acc=OK, k=e, v=f, i=2
# => "OK"
Not sure if it's more Rubyish than your way =\
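Applied to a hash shaped like the one in the question, the same pattern might look like the sketch below. Plain arrays of ids stand in for the ActiveRecord objects, and the "_collection" values are just those arrays, so this only illustrates the indexing, not the database calls.

```ruby
# Stand-in for the group_by result: array keys, arrays of ids as values.
group = { ["junior"] => [1, 2], ["senior"] => [3, 4] }

result = group.each.with_index.reduce({}) do |acc, ((k, v), i)|
  # i comes from with_index, so no external counter is needed.
  acc[i] = { "#{k.first}_collection" => v }
  acc
end

p result
```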

Merge duplicates in array of hashes

I have an array of hashes in ruby:
[
{name: 'one', tags: 'xxx'},
{name: 'two', tags: 'yyy'},
{name: 'one', tags: 'zzz'},
]
and I'm looking for a clean Ruby solution that merges all the duplicates in that array (by merging I mean concatenating the tags param), so that the above example is transformed to:
[
{name: 'one', tags: 'xxx, zzz'},
{name: 'two', tags: 'yyy'},
]
I can iterate through each array element, check if there is a duplicate, merge it with the original entry and delete the duplicate, but I feel that there may be a better solution and that this approach has caveats I don't know about. Thanks for any clue.
One approach I can think of:
arr = [
{name: 'one', tags: 'xxx'},
{name: 'two', tags: 'yyy'},
{name: 'one', tags: 'zzz'},
]
merged_array_hash = arr.group_by { |h1| h1[:name] }.map do |k,v|
{ :name => k, :tags => v.map { |h2| h2[:tags] }.join(", ") }
end
merged_array_hash
# => [{:name=>"one", :tags=>"xxx, zzz"}, {:name=>"two", :tags=>"yyy"}]
Here's a way that makes use of the form of Hash#update (aka Hash#merge!) that takes a block to determine the merged value for every key that is present in both of the hashes being merged.
Code
def combine(a)
a.each_with_object({}) { |g,h| h.update({ g[:name]=>g }) { |k,hv,gv|
{ name: k, tags: hv[:tags]+", "+gv[:tags] } } }.values
end
Example
a = [{name: 'one', tags: 'uuu'},
{name: 'two', tags: 'vvv'},
{name: 'one', tags: 'www'},
{name: 'six', tags: 'xxx'},
{name: 'one', tags: 'yyy'},
{name: 'two', tags: 'zzz'}]
combine(a)
#=> [{:name=>"one", :tags=>"uuu, www, yyy"},
# {:name=>"two", :tags=>"vvv, zzz" },
# {:name=>"six", :tags=>"xxx" }]
Explanation
Suppose
a = [{name: 'one', tags: 'uuu'},
{name: 'two', tags: 'vvv'},
{name: 'one', tags: 'www'}]
b = a.each_with_object({})
#=> #<Enumerator: [{:name=>"one", :tags=>"uuu"},
# {:name=>"two", :tags=>"vvv"},
# {:name=>"one", :tags=>"www"}]:each_with_object({})>
We can convert the enumerator b to an array to see what values it will pass into its block:
b.to_a
#=> [[{:name=>"one", :tags=>"uuu"}, {}],
# [{:name=>"two", :tags=>"vvv"}, {}],
# [{:name=>"one", :tags=>"www"}, {}]]
The first value passed to the block and assigned to the block variables is:
g,h = [{:name=>"one", :tags=>"uuu"}, {}]
g #=> {:name=>"one", :tags=>"uuu"}
h #=> {}
The first merge operation is now performed (the merged h is returned):
h.update({ g[:name] => g })
#=> h.update({ "one" => {:name=>"one", :tags=>"uuu"} })
#=> {"one"=>{:name=>"one", :tags=>"uuu"}}
h does not have the key "one", so update's block is not invoked.
Next, the enumerator b passes the following into the block:
g #=> {:name=>"two", :tags=>"vvv"}
h #=> {"one"=>{:name=>"one", :tags=>"uuu"}}
so we execute:
h.update({ g[:name] => g })
#=> h.update({ "two"=>{:name=>"two", :tags=>"vvv"} })
#=> {"one"=>{:name=>"one", :tags=>"uuu"},
# "two"=>{:name=>"two", :tags=>"vvv"}}
Again, h does not have the key "two", so the block is not used.
Lastly, each_with_object passes the final tuple into the block:
g #=> {:name=>"one", :tags=>"www"}
h #=> {"one"=>{:name=>"one", :tags=>"uuu"},
# "two"=>{:name=>"two", :tags=>"vvv"}}
and we execute:
h.update({ g[:name] => g })
#=> h.update({ "one"=>{:name=>"one", :tags=>"www"} })
h has a key/value pair with key "one":
"one"=>{:name=>"one", :tags=>"uuu"}
update's block is therefore executed to determine the merged value. The following values are passed to that block's variables:
k #=> "one"
hv #=> {:name=>"one", :tags=>"uuu"} <h's value for "one">
gv #=> {:name=>"one", :tags=>"www"} <g's value for "one">
and the block calculation creates this hash (as the merged value for the key "one"):
{ name: k, tags: hv[:tags]+", "+gv[:tags] }
#=> { name: "one", tags: "uuu" + ", " + "www" }
#=> { name: "one", tags: "uuu, www" }
So the merged hash now becomes:
h #=> {"one"=>{:name=>"one", :tags=>"uuu, www"},
# "two"=>{:name=>"two", :tags=>"vvv" }}
All that remains is to extract the values:
h.values
#=> [{:name=>"one", :tags=>"uuu, www"}, {:name=>"two", :tags=>"vvv"}]
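The block form of update/merge! used in this walkthrough can also be seen in isolation. The following is a small sketch with illustrative names (`tags`, `old_val`, `new_val`): the block runs only for keys present in both hashes, and its return value becomes the merged value.

```ruby
tags = { "one" => "uuu" }

# "one" exists on both sides, so the block decides its value;
# "two" is new, so it is copied over without calling the block.
tags.merge!({ "one" => "www", "two" => "vvv" }) do |_key, old_val, new_val|
  "#{old_val}, #{new_val}"
end

p tags
```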

Removing hashes that have identical values for particular keys

I have an Array of Hashes with the same keys, storing people's data.
I want to remove the hashes that have the same values for the keys :name and :surname. The rest of the values can differ, so calling uniq! on array won't work.
Is there a simple solution for this?
You can pass a block to uniq or uniq!; the value returned by the block is used to compare two entries for equality:
irb> people = [{name: 'foo', surname: 'bar', age: 10},
{name: 'foo', surname: 'bar', age: 11}]
irb> people.uniq { |p| [p[:name], p[:surname]] }
=> [{:name=>"foo", :surname=>"bar", :age=>10}]
arr = [{name: 'john', surname: 'smith', phone: 123456789},
{name: 'thomas', surname: 'hardy', phone: 671234992},
{name: 'john', surname: 'smith', phone: 666777888}]
# [{:name=>"john", :surname=>"smith", :phone=>123456789},
# {:name=>"thomas", :surname=>"hardy", :phone=>671234992},
# {:name=>"john", :surname=>"smith", :phone=>666777888}]
arr.uniq {|h| [h[:name], h[:surname]]}
# [{:name=>"john", :surname=>"smith", :phone=>123456789},
# {:name=>"thomas", :surname=>"hardy", :phone=>671234992}]
unique_people = {}
person_array.each do |person|
unique_people["#{person[:name]} #{person[:surname]}"] = person
end
array_of_unique_people = unique_people.values
This should do the trick.
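One difference worth knowing between the hash-based approach and uniq with a block: uniq keeps the first entry for each name and surname pair, while assigning into a hash overwrites earlier entries, so the last one survives. A minimal sketch with made-up phone values (`first_wins` and `last_wins` are illustrative names):

```ruby
people = [
  { name: 'john', surname: 'smith', phone: 1 },
  { name: 'thomas', surname: 'hardy', phone: 2 },
  { name: 'john', surname: 'smith', phone: 3 }
]

# uniq with a block keeps the first hash seen for each key.
first_wins = people.uniq { |person| person.values_at(:name, :surname) }

# Hash assignment overwrites, so the last hash seen for each key is kept.
last_wins = people.each_with_object({}) do |person, seen|
  seen[person.values_at(:name, :surname)] = person
end.values

p first_wins.map { |person| person[:phone] }
p last_wins.map { |person| person[:phone] }
```

Using an array of the two fields as the key also avoids the ambiguity of a space-joined string key when names themselves contain spaces.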
a.delete_if do |h|
a.select{|i| i[:name] == h[:name] and i[:surname] == h[:surname] }.count > 1
end
