build a hash from iterating over a hash with nested arrays - ruby

I'd like to structure the data I get back from an Instagram API call:
{"attribution"=>nil,
"tags"=>["loudmouth"],
"location"=>{"latitude"=>40.7181015, "name"=>"Fontanas Bar", "longitude"=>-73.9922791, "id"=>31443955},
"comments"=>{"count"=>0, "data"=>[]},
"filter"=>"Normal",
"created_time"=>"1444181565",
"link"=>"https://instagram.com/p/8hJ-UwIDyC/",
"likes"=>{"count"=>0, "data"=>[]},
"images"=>
{"low_resolution"=>{"url"=>"https://scontent.cdninstagram.com/hphotos-xaf1/t51.2885-15/s320x320/e35/12145134_169501263391761_636095824_n.jpg", "width"=>320, "height"=>320},
"thumbnail"=>
{"url"=>"https://scontent.cdninstagram.com/hphotos-xfa1/t51.2885-15/s150x150/e35/c135.0.810.810/12093266_813307028768465_178038954_n.jpg", "width"=>150, "height"=>150},
"standard_resolution"=>
{"url"=>"https://scontent.cdninstagram.com/hphotos-xaf1/t51.2885-15/s640x640/sh0.08/e35/12145134_169501263391761_636095824_n.jpg", "width"=>640, "height"=>640}},
"users_in_photo"=>
[{"position"=>{"y"=>0.636888889, "x"=>0.398666667},
"user"=>
{"username"=>"ambersmelson",
"profile_picture"=>"http://photos-h.ak.instagram.com/hphotos-ak-xfa1/t51.2885-19/11909108_1492226137759631_1159527917_a.jpg",
"id"=>"194780705",
"full_name"=>""}}],
"caption"=>
{"created_time"=>"1444181565",
"text"=>"the INCOMPARABLE Amber Nelson closing us out! #loudmouth",
"from"=>
{"username"=>"alex3nglish",
"profile_picture"=>"http://photos-f.ak.instagram.com/hphotos-ak-xaf1/t51.2885-19/s150x150/11906214_483262888501413_294704768_a.jpg",
"id"=>"30822062",
"full_name"=>"Alex English"}}
I'd like to structure it in this way:
hash = {}
hash["item1"] = {
:location => {"latitude"=>40.7181015, "name"=>"Fontanas Bar", "longitude"=>-73.9922791, "id"=>31443955},
:created_time => "1444181565",
:images => "https://scontent.cdninstagram.com/hphotos-xaf1/t51.2885-15/s320x320/e35/12145134_169501263391761_636095824_n.jpg",
:user => "Alex English"}
I'm iterating over 20 objects, each with their location, images, etc... how can I get a hash structure like the one above ?
This is what I've tried:
array_images = Array.new
# iterate through response object to extract what is needed
response.each do |item|
array_images << { :image => item.images.low_resolution.url,
:location => item.location,:created_time => Time.at(item.created_time.to_i), :user => item.user.full_name}
end
Which works fine. So what is the better way, the fastest one?

The hash that you gave is one item in the array stored at the key "data" in a larger hash right? At least that's how it is for the tags/ endpoint so I'll assume it's the same here. (I'm referring to that array of hashes as data)
hash = {}
data.each_with_index do |h, idx|
hash["item#{idx + 1}"] = {
location: h["location"], #This grabs the entire hash at "location" because you are wanting all of that data
created_time: h["created_time"],
image: h["images"]["low_resolution"]["url"], # You can replace this with whichever resolution.
user: h["caption"]["from"]["full_name"] # The poster's full name lives under "caption" => "from"
}
end
I feel like you want a simpler solution, but I'm not sure how that's going to happen, as you want things nested at different levels and you are pulling things from diverse levels of nesting.
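As a minimal sketch of the same loop, assuming `data` is an array of hashes shaped like the one in the question (the two-item `data` below is made up for illustration), `Hash#dig` (Ruby 2.3+) returns nil instead of raising when an intermediate key such as "caption" is nil:

```ruby
# Hypothetical sample standing in for the API response's "data" array.
data = [
  { "location" => { "name" => "Fontanas Bar" },
    "created_time" => "1444181565",
    "images" => { "low_resolution" => { "url" => "https://example.com/a.jpg" } },
    "caption" => { "from" => { "full_name" => "Alex English" } } },
  { "location" => nil,
    "created_time" => "1444181600",
    "images" => { "low_resolution" => { "url" => "https://example.com/b.jpg" } },
    "caption" => nil } # assumption: a post without a caption has nil here
]

# Build { "item1" => {...}, "item2" => {...} } in one pass.
items = data.each_with_index.each_with_object({}) do |(h, idx), result|
  result["item#{idx + 1}"] = {
    location: h["location"],
    created_time: Time.at(h["created_time"].to_i),
    image: h.dig("images", "low_resolution", "url"),
    user: h.dig("caption", "from", "full_name") # nil when there is no caption
  }
end
```

With the chained `h["caption"]["from"]["full_name"]` form, the second sample item would raise NoMethodError on nil, so `dig` is the safer choice if some fields can be absent.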

Related

Create a Ruby Hash out of an xml string with the 'ox' gem

I am currently trying to create a hash out of an XML document with the help of the ox gem.
Input xml:
<?xml version="1.0"?>
<expense>
<payee>starbucks</payee>
<amount>5.75</amount>
<date>2017-06-10</date>
</expense>
with the following ruby/ox code:
doc = Ox.parse(xml)
plist = doc.root.nodes
I get the following output:
=> [#<Ox::Element:0x00007f80d985a668 @value="payee", @attributes={}, @nodes=["starbucks"]>, #<Ox::Element:0x00007f80d9839198 @value="amount", @attributes={}, @nodes=["5.75"]>, #<Ox::Element:0x00007f80d9028788 @value="date", @attributes={}, @nodes=["2017-06-10"]>]
The output I want is a hash in the format:
{'payee' => 'Starbucks',
'amount' => 5.75,
'date' => '2017-06-10'}
to save in my SQLite database. How can I transform the array of objects into a hash like the one above?
Any help is highly appreciated.
The docs suggest you can use the following:
require 'ox'
xml = %{
<top name="sample">
<middle name="second">
<bottom name="third">Rock bottom</bottom>
</middle>
</top>
}
puts Ox.load(xml, mode: :hash)
puts Ox.load(xml, mode: :hash_no_attrs)
#{:top=>[{:name=>"sample"}, {:middle=>[{:name=>"second"}, {:bottom=>[{:name=>"third"}, "Rock bottom"]}]}]}
#{:top=>{:middle=>{:bottom=>"Rock bottom"}}}
I'm not sure that's exactly what you're looking for though.
Otherwise, it really depends on the methods available on the Ox::Element instances in the array.
From the docs, it looks like there are two handy methods here: you can use [] and text.
Therefore, I'd use reduce to coerce the array into the hash format you're looking for, using something like the following:
ox_nodes = Ox.parse(xml).root.nodes # the array of Ox::Element objects shown in the question
ox_nodes.reduce({}) do |hash, node|
hash[node['@value']] = node.text
hash
end
I'm not sure whether node['@value'] will work, so you might need to experiment with that - otherwise perhaps node.instance_variable_get('@value') would do it.
node.text does the following, which sounds about right:
Returns the first String in the elements nodes array or nil if there is no String node.
N.B. I prefer to tidy the reduce block a little using tap, something like the following:
ox_nodes.reduce({}) do |hash, node|
hash.tap { |h| h[node['@value']] = node.text }
end
Hope that helps - let me know how you get on!
I found the answer to the question in my last comment by myself:
def create_xml(expense)
Ox.default_options=({:with_xml => false})
doc = Ox::Document.new(:version => '1.0')
expense.each do |key, value|
e = Ox::Element.new(key)
e << value
doc << e
end
Ox.dump(doc)
end
The next question would be: how can I transform the value of the amount key from a string to a number before saving it to the database?
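For that last step, here is a minimal sketch, assuming the expense has already been turned into a hash with string values (the `expense` literal below is illustrative, copied from the question's XML). Since 5.75 is not a whole number, the choice is between a Float and an integer count of cents:

```ruby
expense = { 'payee' => 'starbucks', 'amount' => '5.75', 'date' => '2017-06-10' }

# Kernel#Float raises ArgumentError on malformed input,
# whereas String#to_f would silently return 0.0.
as_float = expense.merge('amount' => Float(expense['amount']))

# Going through Rational keeps the arithmetic exact, so "5.75" becomes 575 cents
# without binary floating-point rounding.
as_cents = expense.merge('amount' => (Rational(expense['amount']) * 100).to_i)
```

Storing cents as an integer is a common convention for money columns; whether it fits depends on how the database column is declared.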

Ruby on Rails 4: Pluck results to hash

How can I turn:
Person.all.pluck(:id, :name)
to
[{id: 1, name: 'joe'}, {id: 2, name: 'martin'}]
without having to .map every value (since when I add or remove from the .pluck I have to do the same with the .map)?
You can map the result:
Person.all.pluck(:id, :name).map { |id, name| {id: id, name: name}}
As mentioned by @alebian:
This is more efficient than
Person.all.as_json(only: [:id, :name])
Reasons:
pluck only returns the used columns (:id, :name) whereas the other solution returns all columns. Depending on the width of the table (number of columns) this makes quite a difference
The pluck solution does not instantiate Person objects, does not need to assign attributes to the models and so on. Instead it just returns an array with one integer and one string.
as_json again has more overhead than the simple map as it is a generic implementation to convert a model to a hash
You could simply do this
Person.select(:id,:name).as_json
You could try this as well
Person.all.as_json(only: [:id, :name])
I see three options:
1) pluck plus map:
Person.pluck(:id, :name).map { |p| { id: p[0], name: p[1] } }
2) pluck plus map plus zip and a variable to make it DRY-er:
attrs = %w(id name)
Person.pluck(*attrs).map { |p| attrs.zip(p).to_h }
3) or you might not use pluck at all although this is much less performant:
Person.all.map { |p| p.slice(:id, :name) }
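Option 2 can be tried without a database. In this sketch the `rows` array is a hypothetical stand-in for what `Person.pluck(*attrs)` would return; the zip step is the same:

```ruby
attrs = %i[id name]

# Hypothetical stand-in for Person.pluck(*attrs): one inner array per record.
rows = [[1, 'joe'], [2, 'martin']]

# Pair each column name with the value in the same position, then build a hash.
people = rows.map { |row| attrs.zip(row).to_h }
```

Because the column list lives in one place (`attrs`), adding or removing a column needs no change to the mapping step.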
If you use postgresql, you can use json_build_object function in pluck method:
https://www.postgresql.org/docs/9.5/functions-json.html
That way, you can let db create hashes.
Person.pluck("json_build_object('id', id, 'name', name)")
#=> [{id: 1, name: 'joe'}, {id: 2, name: 'martin'}]
Could go for a hash after the pluck with the ID being the key and the Name being the value:
Person.all.pluck(:id, :name).to_h
{ 1 => 'joe', 2 => 'martin' }
Not sure if this fits your needs, but presenting as an option.
You can use the aptly-named pluck_to_hash gem for this:
https://github.com/girishso/pluck_to_hash
It will extend AR with pluck_to_hash method that works like this:
Post.limit(2).pluck_to_hash(:id, :title)
#
# [{:id=>213, :title=>"foo"}, {:id=>214, :title=>"bar"}]
#
Post.limit(2).pluck_to_hash(:id)
#
# [{:id=>213}, {:id=>214}]
It claims to be several times faster than using AR select and as_json
There is also the pluck_all gem, which does almost the same thing as pluck_to_hash. It claims to be 30% faster (see the gem's benchmark).
Usage:
Person.pluck_all(:id, :name)
If you have multiple attributes, you may do this for cleanliness:
Item.pluck(:id, :name, :description, :cost, :images).map do |item|
{
id: item[0],
name: item[1],
description: item[2],
cost: item[3],
images: item[4]
}
end
The easiest way is to use the pluck method combined with the zip method.
attrs_array = %w(id name)
Person.all.pluck(*attrs_array).map { |ele| attrs_array.zip(ele).to_h }
You can also create a helper method if you are using this method through out your application.
def pluck_to_hash(object, *attrs)
object.pluck(*attrs).map { |ele| attrs.zip(ele).to_h }
end
Consider modifying by declaring self as the default receiver rather than passing Person.all as the object variable.
Read more about zip.
Here is a method that has worked well for me:
def pluck_to_hash(enumerable, *field_names)
enumerable.pluck(*field_names).map do |field_values|
field_names.zip(field_values).each_with_object({}) do |(key, value), result_hash|
result_hash[key] = value
end
end
end
I know it's an old thread but in case someone is looking for simpler version of this
Hash[Person.pluck(:id, :name)]
Tested in Rails 5.

Ruby find key by name inside converted JSON array of hashes

I have a Ruby hash converted from JSON data, it looks like this:
{ :query => {
:pages => {
:"743958" => {
:pageid => 743958,
:ns => 0,
:title => "Asterix the Gaul",
:revisions => [ {
:contentformat => "text/x-wiki",
:contentmodel => "wikitext",
:* => "{{Cleanup|date=April 2010}}\n{{Infobox graphic novel\n<!--Wikipedia:WikiProject Comics-->...
All the good stuff is inside the revisions array and then the Infobox hash.
The problem I have is getting to the Infobox hash. I can't seem to get to it. The pages and pageid hashes might not exist for other entries and of course the ID would be different.
I've tried all sorts of methods I could think of like .map, .select, .find, .include?, etc to no avail because they are not recursive and will not go into each key and array.
And all the answers I've seen in StackOverflow are to get the value by name inside a one-dimensional array which doesn't help.
How can I get the Infobox data from this?
Is this what you're looking for?
pp data
=> {:query=> {:pages=>
{:"743958"=>
{:pageid=>743958,
:ns=>0,
:title=>"Asterix the Gaul",
:revisions=>
[{:contentformat=>"text/x-wiki",
:contentmodel=>"wikitext",
:*=>"{{Cleanup..."}]}}}}
# just return data from the first revision
data[:query][:pages].map{|page_id,page_hash| page_hash[:revisions].first[:"*"]}
=> ["{{Cleanup..."]
# get data from all revisions
data[:query][:pages].map{|page_id,page_hash| page_hash[:revisions].map{|revision| revision[:"*"] }}.flatten
=> ["{{Cleanup..."]
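If the path to the key is not known in advance (the question mentions that pages and the page ID may differ), a recursive search is one option. This is a sketch only: `deep_find` is a hypothetical helper, not part of Ruby's core library, and the `data` hash is a shortened copy of the one above:

```ruby
# Depth-first search for the first non-nil value stored under `key`
# anywhere inside nested hashes and arrays. Returns nil if nothing is found.
def deep_find(obj, key)
  case obj
  when Hash
    return obj[key] if obj.key?(key)
    obj.each_value do |value|
      found = deep_find(value, key)
      return found unless found.nil?
    end
    nil
  when Array
    obj.each do |value|
      found = deep_find(value, key)
      return found unless found.nil?
    end
    nil
  end
end

data = { query: { pages: { "743958": { pageid: 743958,
                                        revisions: [{ contentmodel: "wikitext",
                                                      "*": "{{Cleanup..." }] } } } }

wikitext = deep_find(data, :"*")
```

The infobox itself is still wikitext inside that string, so extracting its fields would need a separate parsing step.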

Breaking out variably deeply nested hashes into separate hashes with Ruby

I'm pretty new to Ruby, but I've done a ton of searches, research here on Stack, and experimentation.
I'm getting POST data that contains variable information which I am able to convert into a hash from XML.
My objectives are to:
Get and store the parentage key hierarchy.
I'm creating MongoDb records of what I get via these POSTs, and I need to record what keys I get storing any new ones I get that aren't already part of the collections keys.
Once I have the key hierarchy stored, I need to take the nested hash and break out each top level key and its children into another hash. These will end up as individual subdocuments in a MongoDb record.
A big obstacle is that I won't know the hierarchy structure or any of the key names up front, so I have to create a parser that doesn't really care what is in the hash: it just organizes the key structure and breaks the hash up into separate hashes, one for each 'top level' key contained in the hash.
I have a nested hash:
{"hashdata"=>
{"ComputersCount"=>
{"Total"=>1, "Licensed"=>1, "ByOS"=>{"OS"=>{"Windows 7 x64"=>1}}},
"ScansCount"=>
{"Total"=>8,
"Scheduled"=>8,
"Agent"=>0,
"ByScanningProfile"=>{"Profile"=>{"Missing Patches"=>8}}},
"RemediationsCount"=>{"Total"=>1, "ByType"=>{"Type"=>{"9"=>1}}},
"AgentsCount"=>{"Total"=>0},
"RelaysCount"=>{"Total"=>0},
"ScanResultsDatabase"=>{"Type"=>"MSAccess"}}}
In this example, ignoring the 'hashdata' key, the 'top level' parents are:
ComputersCount
ScansCount
RemediationsCount
AgentsCount
RelaysCount
ScanResultsDatabase
So ideally, I would end up with a hash of each parent key and its children keys, and a separate hash for each of the top level parents.
EDIT: I'm not sure the best way to articulate the 'keys hash' but I know it needs to contain a sense of the hierarchy structure with regards to what level and parent a key in the structure might have.
For the separate hashes themselves it could be as simple as:
{"ComputersCount"=>{"Total"=>1, "Licensed"=>1, "ByOS"=>{"OS"=>{"Windows 7 x64"=>1}}}}
{"ScansCount"=>{"Total"=>8,"Scheduled"=>8,"Agent"=>0,"ByScanningProfile"=>{"Profile"=>{"Missing Patches"=>8}}}}
{"RemediationsCount"=>{"Total"=>1, "ByType"=>{"Type"=>{"9"=>1}}}}
{"AgentsCount"=>{"Total"=>0}}
{"RelaysCount"=>{"Total"=>0}}
{"ScanResultsDatabase"=>{"Type"=>"MSAccess"}}
My ultimate goal is to take the key collections and the hash collections and store them in MongoDb, each sub hash is a sub-document, and the keys collection gives me a column name map for the collection so it can be queried against later.
I've come close to a solution using some recursive methods for example:
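def recurse_hash(h, parent = nil) # parameter renamed so it no longer shadows Kernel#p
h.each_pair do |k, v|
case v
when String, Integer # Fixnum was folded into Integer in Ruby 2.4
p "Key: #{k}, Value: #{v}"
when Hash
h.find_all_values_for(v) # not a core Hash method; assumed to be a helper defined elsewhere
recurse_hash(v, k)
else
raise ArgumentError, "Unhandled type #{v.class}"
end
end
end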
But so far, I've only been able to get close to what I'm after. Ultimately, I need to be prepared to get hashes with any level of nesting or value structures because the POST data is highly variable.
Any advice, guidance or other assistance here would be greatly appreciated - I realize I could very well be approaching this entire challenge incorrectly.
Looks like you want an array of hashes like the following:
array = hash["hashdata"].map { |k,v| { k => v } }
# => [{"ComputersCount"=>{"Total"=>1, "Licensed"=>1, "ByOS"=>{"OS"=>{"Windows 7 x64"=>1}}}}, ... ]
array.first
# => {"ComputersCount"=>{"Total"=>1, "Licensed"=>1, "ByOS"=>{"OS"=>{"Windows 7 x64"=>1}}}}
array.last
# => {"ScanResultsDatabase"=>{"Type"=>"MSAccess"}}
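For the key hierarchy part of the question, one possible sketch is to record every key together with the path of its parents. `key_paths` is a hypothetical helper name, and the `hashdata` sample is a trimmed copy of the hash in the question:

```ruby
# Returns one array per key, listing the key's ancestors followed by the key itself.
def key_paths(hash, prefix = [])
  hash.flat_map do |key, value|
    path = prefix + [key]
    # Recurse into nested hashes so children are listed after their parent.
    value.is_a?(Hash) ? [path] + key_paths(value, path) : [path]
  end
end

hashdata = {
  "ComputersCount" => { "Total" => 1, "ByOS" => { "OS" => { "Windows 7 x64" => 1 } } },
  "AgentsCount" => { "Total" => 0 }
}

paths = key_paths(hashdata).map { |path| path.join(".") }
```

The length of each path gives the nesting level and the second-to-last element gives the parent, which is the information the question asks to store alongside the data.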
Here's my best guess at the "key structure hierarchy and parentage."
I gently suggest that it is overkill.
Instead, I think that all you really need to do is just store your hashdata directly as MongoDB documents.
Even if your POST data is highly variable,
in all likelihood it will still be sufficiently well-formed that you can write your application without difficulty.
Here's a test that incorporates "key structure hierarchy and parentage",
but maybe more importantly just shows how trivial it is to store your hashdata directly as a MongoDB document.
The test is run twice to demonstrate new key discovery.
test.rb
require 'mongo'
require 'test/unit'
require 'pp'
def key_structure(h)
h.keys.sort.collect{|k| v = h[k]; v.is_a?(Hash) ? [k, key_structure(h[k])] : k}
end
class MyTest < Test::Unit::TestCase
def setup
@hash_data_coll = Mongo::MongoClient.new['test']['hash_data']
@hash_data_coll.remove
@keys_coll = Mongo::MongoClient.new['test']['keys']
end
test "extract cancer drugs" do
hash_data = {
"hashdata" =>
{"ComputersCount" =>
{"Total" => 1, "Licensed" => 1, "ByOS" => {"OS" => {"Windows 7 x64" => 1}}},
"ScansCount" =>
{"Total" => 8,
"Scheduled" => 8,
"Agent" => 0,
"ByScanningProfile" => {"Profile" => {"Missing Patches" => 8}}},
"RemediationsCount" => {"Total" => 1, "ByType" => {"Type" => {"9" => 1}}},
"AgentsCount" => {"Total" => 0},
"RelaysCount" => {"Total" => 0},
"ScanResultsDatabase" => {"Type" => "MSAccess"}}}
known_keys = @keys_coll.find.to_a.collect{|doc| doc['key']}.sort
puts "known keys: #{known_keys}"
hash_data_keys = hash_data['hashdata'].keys.sort
puts "hash data keys: #{hash_data_keys.inspect}"
new_keys = hash_data_keys - known_keys
puts "new keys: #{new_keys.inspect}"
@keys_coll.insert(new_keys.collect{|key| {key: key, structure: key_structure(hash_data['hashdata'][key]), timestamp: Time.now}}) unless new_keys.empty?
pp @keys_coll.find.to_a unless new_keys.empty?
@hash_data_coll.insert(hash_data['hashdata'])
assert_equal(1, @hash_data_coll.count)
pp @hash_data_coll.find.to_a
end
end
$ ruby test.rb
Loaded suite test
Started
known keys: []
hash data keys: ["AgentsCount", "ComputersCount", "RelaysCount", "RemediationsCount", "ScanResultsDatabase", "ScansCount"]
new keys: ["AgentsCount", "ComputersCount", "RelaysCount", "RemediationsCount", "ScanResultsDatabase", "ScansCount"]
[{"_id"=>BSON::ObjectId('535976177f11ba278d000001'),
"key"=>"AgentsCount",
"structure"=>["Total"],
"timestamp"=>2014-04-24 20:37:43 UTC},
{"_id"=>BSON::ObjectId('535976177f11ba278d000002'),
"key"=>"ComputersCount",
"structure"=>[["ByOS", [["OS", ["Windows 7 x64"]]]], "Licensed", "Total"],
"timestamp"=>2014-04-24 20:37:43 UTC},
{"_id"=>BSON::ObjectId('535976177f11ba278d000003'),
"key"=>"RelaysCount",
"structure"=>["Total"],
"timestamp"=>2014-04-24 20:37:43 UTC},
{"_id"=>BSON::ObjectId('535976177f11ba278d000004'),
"key"=>"RemediationsCount",
"structure"=>[["ByType", [["Type", ["9"]]]], "Total"],
"timestamp"=>2014-04-24 20:37:43 UTC},
{"_id"=>BSON::ObjectId('535976177f11ba278d000005'),
"key"=>"ScanResultsDatabase",
"structure"=>["Type"],
"timestamp"=>2014-04-24 20:37:43 UTC},
{"_id"=>BSON::ObjectId('535976177f11ba278d000006'),
"key"=>"ScansCount",
"structure"=>
["Agent",
["ByScanningProfile", [["Profile", ["Missing Patches"]]]],
"Scheduled",
"Total"],
"timestamp"=>2014-04-24 20:37:43 UTC}]
[{"_id"=>BSON::ObjectId('535976177f11ba278d000007'),
"ComputersCount"=>
{"Total"=>1, "Licensed"=>1, "ByOS"=>{"OS"=>{"Windows 7 x64"=>1}}},
"ScansCount"=>
{"Total"=>8,
"Scheduled"=>8,
"Agent"=>0,
"ByScanningProfile"=>{"Profile"=>{"Missing Patches"=>8}}},
"RemediationsCount"=>{"Total"=>1, "ByType"=>{"Type"=>{"9"=>1}}},
"AgentsCount"=>{"Total"=>0},
"RelaysCount"=>{"Total"=>0},
"ScanResultsDatabase"=>{"Type"=>"MSAccess"}}]
.
Finished in 0.028869 seconds.
1 tests, 1 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
34.64 tests/s, 34.64 assertions/s
$ ruby test.rb
Loaded suite test
Started
known keys: ["AgentsCount", "ComputersCount", "RelaysCount", "RemediationsCount", "ScanResultsDatabase", "ScansCount"]
hash data keys: ["AgentsCount", "ComputersCount", "RelaysCount", "RemediationsCount", "ScanResultsDatabase", "ScansCount"]
new keys: []
[{"_id"=>BSON::ObjectId('535976197f11ba278e000001'),
"ComputersCount"=>
{"Total"=>1, "Licensed"=>1, "ByOS"=>{"OS"=>{"Windows 7 x64"=>1}}},
"ScansCount"=>
{"Total"=>8,
"Scheduled"=>8,
"Agent"=>0,
"ByScanningProfile"=>{"Profile"=>{"Missing Patches"=>8}}},
"RemediationsCount"=>{"Total"=>1, "ByType"=>{"Type"=>{"9"=>1}}},
"AgentsCount"=>{"Total"=>0},
"RelaysCount"=>{"Total"=>0},
"ScanResultsDatabase"=>{"Type"=>"MSAccess"}}]
.
Finished in 0.015559 seconds.
1 tests, 1 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
64.27 tests/s, 64.27 assertions/s

How to construct the 2d structure in a dynamic fashion

I iterate through all cars and their supported attributes (many attributes per car) to create a structure like this. How do I do this in a dynamic fashion?
cars = {
"honda" => {'color' => 'blue', 'type' => 'sedan'},
"nissan" => {'color' => 'yellow', 'type' => 'sports'},
...
}
cars.each do |car|
car_attrs = ...
car_attrs.each do |attr|
??? How to construct the above structure
end
end
Your question is not very clear, but I guess this is what you want:
cars = {}
options = {}
options['color'] = 'blue'
...
cars['honda'] = options
Is that what you were looking for?
It sounds like you may be asking for a way to create a 2-dimensional hash without having to explicitly create each child hash. One way to accomplish that is by specifying the default object created for a hash key.
# When we create the cars hash, we tell it to create a new Hash
# for undefined keys
cars = Hash.new { |hash, key| hash[key] = Hash.new }
# We can then assign values two-levels deep as follows
cars["honda"]["color"] = "blue"
cars["honda"]["type"] = "sedan"
cars["nissan"]["color"] = "yellow"
cars["nissan"]["type"] = "sports"
# But be careful not to check for nil using the [] operator
# because a default hash is now created when using it
puts "Found a Toyota" if cars["toyota"]
# The correct way to check would be
puts "Really found a Toyota" if cars.has_key? "toyota"
Many client libraries assume that the [] operator returns a nil default, so make sure other code doesn't depend on that behavior before using this solution. Good luck!
Assuming you are using something similar to ActiveRecord (but easy to modify if you are not):
cars_info = Hash[cars.map { |car| [car.name, car.attributes] }]
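To try this without a database, here is a sketch that follows the nested loop from the question. The `Car` struct is a hypothetical stand-in for whatever object exposes `name` and `attributes`:

```ruby
# Hypothetical stand-in for a model with a name and a hash of attributes.
Car = Struct.new(:name, :attributes)

cars = [
  Car.new("honda", { "color" => "blue", "type" => "sedan" }),
  Car.new("nissan", { "color" => "yellow", "type" => "sports" })
]

# Outer loop over cars, inner loop over each car's attributes.
cars_info = cars.each_with_object({}) do |car, result|
  car.attributes.each do |attr, value|
    # Create the inner hash the first time a car name is seen.
    (result[car.name] ||= {})[attr] = value
  end
end
```

Unlike the `Hash.new` default-block approach, `||=` only creates an inner hash on assignment, so reading a missing key such as `cars_info["toyota"]` still returns nil.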

Resources