How to get values in XML data using Nokogiri?

I'm using Nokogiri to parse XML data that I'm getting from the roar engine after I create a user. The XML looks like below:
<roar tick="135098427907">
<facebook>
<create_oauth status="ok">
<auth_token>14802206136746256007</auth_token>
<player_id>8957881063899628798</player_id>
</create_oauth>
</facebook>
</roar>
I'm totally new to Nokogiri. How do I get the value of status, the auth_token and player_id?

require 'nokogiri'
str = "<roar ......"
doc = Nokogiri.XML(str)
puts doc.xpath('//create_oauth/@status') # => ok
puts doc.xpath('//auth_token').text # => 148....
# player_id works the same way as auth_token
It is also a good idea to learn some XPath basics, for example from w3schools.

How about this:
h1 = Nokogiri::XML.parse %{
<roar tick="135098427907">
<facebook>
<create_oauth status="ok">
<auth_token>14802206136746256007</auth_token>
<player_id>8957881063899628798</player_id>
</create_oauth>
</facebook>
</roar>
}
h1.xpath("//facebook/create_oauth/auth_token").text # => "14802206136746256007"
h1.xpath("//facebook/create_oauth/player_id").text # => "8957881063899628798"

You can use the Nori gem. It's an XML-to-hash converter, and hashes are very convenient to work with in Ruby:
require 'nori'
Nori.parser = :nokogiri # Nori 1.x API; Nori 2.x uses Nori.new(parser: :nokogiri).parse(xml) instead
xml = "<roar tick='135098427907'>
<facebook>
<create_oauth status='ok'>
<auth_token>14802206136746256007</auth_token>
<player_id>8957881063899628798</player_id>
</create_oauth>
</facebook>
</roar>"
hash = Nori.parse(xml)
create_oauth = hash["roar"]["facebook"]["create_oauth"]
puts create_oauth["auth_token"] # 14802206136746256007
puts create_oauth["@status"] # ok
puts create_oauth["player_id"] # 8957881063899628798

Related

Get nested XML tags using Nokogiri in Ruby

I use this Ruby code to get the value of an XML tag:
def value_from_xml_for(xml, tag)
xml_body = Nokogiri::XML(xml)
hash = Hash.from_xml(xml_body.to_s)
hash.dig("payment_transaction", tag)
end
I want to get the value for country:
<payment_transaction>
<card_holder>Automation Example</card_holder>
<billing_address>
<country>DE</country>
</billing_address>
</payment_transaction>
Currently I get nil. How can I extend the code to work with nested XML tags?
Here's an example of how to parse XML with Nokogiri, which is all your method is trying to do:
require 'nokogiri'
xml_str = <<EOF
<payment_transaction>
<card_holder>Automation Example</card_holder>
<billing_address>
<country>DE</country>
</billing_address>
</payment_transaction>
EOF
doc = Nokogiri::XML(xml_str)
doc.at_xpath('//country').to_s # => "<country>DE</country>"
So just use Nokogiri to parse the XML, since that is what it's good at; you don't need hash methods at all. Your method should look like this:
def value_from_xml_for(xml, tag)
tag = "//" + tag # assuming you pass 'country' as tag
xml_body = Nokogiri::XML(xml)
xml_body.at_xpath(tag)
end
If you want just the text of the tag:
def value_from_xml_for(xml, tag)
xpath = "//" + tag # assuming you pass 'country' as tag
xml_body = Nokogiri::XML(xml)
node = xml_body.at_xpath(xpath) # nil when the tag is not present
node.text if node
end
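If you would rather keep the hash-based approach from the question, the nil comes from calling dig with only one level of keys: country sits under billing_address. A minimal sketch follows. The hash literal is an assumption standing in for what Hash.from_xml returns for this XML, and value_from_hash_for is a hypothetical helper, not part of any library.

```ruby
# Stand-in for the result of Hash.from_xml on the XML in the question
# (built by hand here so the example needs no gems).
parsed = {
  "payment_transaction" => {
    "card_holder" => "Automation Example",
    "billing_address" => { "country" => "DE" }
  }
}

# Hypothetical helper: takes the parsed hash plus the path of nested tag names.
def value_from_hash_for(hash, *path)
  hash.dig("payment_transaction", *path)
end

card_holder = value_from_hash_for(parsed, "card_holder")
country     = value_from_hash_for(parsed, "billing_address", "country")

puts card_holder
puts country
```

Hash#dig returns nil as soon as any key in the path is missing, so a wrong path fails quietly rather than raising.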

Create a Ruby Hash out of an xml string with the 'ox' gem

I am currently trying to create a hash out of an XML document with the help of the ox gem.
Input xml:
<?xml version="1.0"?>
<expense>
<payee>starbucks</payee>
<amount>5.75</amount>
<date>2017-06-10</date>
</expense>
with the following ruby/ox code:
doc = Ox.parse(xml)
plist = doc.root.nodes
I get the following output:
=> [#<Ox::Element:0x00007f80d985a668 @value="payee", @attributes={}, @nodes=["starbucks"]>, #<Ox::Element:0x00007f80d9839198 @value="amount", @attributes={}, @nodes=["5.75"]>, #<Ox::Element:0x00007f80d9028788 @value="date", @attributes={}, @nodes=["2017-06-10"]>]
The output I want is a hash in the format:
{'payee' => 'Starbucks',
'amount' => 5.75,
'date' => '2017-06-10'}
to save in my SQLite database. How can I transform the array of objects into a hash like the one above?
Any help is highly appreciated.
The docs suggest you can use the following:
require 'ox'
xml = %{
<top name="sample">
<middle name="second">
<bottom name="third">Rock bottom</bottom>
</middle>
</top>
}
puts Ox.load(xml, mode: :hash)
puts Ox.load(xml, mode: :hash_no_attrs)
#{:top=>[{:name=>"sample"}, {:middle=>[{:name=>"second"}, {:bottom=>[{:name=>"third"}, "Rock bottom"]}]}]}
#{:top=>{:middle=>{:bottom=>"Rock bottom"}}}
I'm not sure that's exactly what you're looking for though.
Otherwise, it really depends on the methods available on the Ox::Element instances in the array.
From the docs, it looks like there are two handy methods here: value, which returns the element name, and text.
Therefore, I'd use reduce to coerce the array into the hash format you're looking for, using something like the following:
ox_nodes = Ox.parse(xml).root.nodes # the array of Ox::Element objects shown in the question
ox_nodes.reduce({}) do |hash, node|
hash[node.value] = node.text
hash
end
I haven't tested this, so you might need to experiment. node['@value'] is unlikely to work, because [] looks up XML attributes; if node.value doesn't do it, perhaps node.instance_variable_get('@value') would.
node.text does the following, which sounds about right:
Returns the first String in the elements nodes array or nil if there is no String node.
N.B. I prefer to tidy the reduce block a little using tap, something like the following:
ox_nodes.reduce({}) do |hash, node|
hash.tap { |h| h[node.value] = node.text }
end
Hope that helps - let me know how you get on!
I found the answer to the question in my last comment by myself:
def create_xml(expense)
Ox.default_options=({:with_xml => false})
doc = Ox::Document.new(:version => '1.0')
expense.each do |key, value|
e = Ox::Element.new(key)
e << value
doc << e
end
Ox.dump(doc)
end
The next question is how I can convert the value of the amount key from a string to an integer before saving it to the database.
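One possible way to handle that conversion with core Ruby only is sketched below. The expense hash is a hypothetical stand-in for the parsed XML, and storing the amount as integer cents is an assumption about the database column, not something the question specifies.

```ruby
# Hypothetical hash, shaped like the one built from the <expense> XML above
expense = { "payee" => "starbucks", "amount" => "5.75", "date" => "2017-06-10" }

# Float() raises ArgumentError on malformed input,
# whereas String#to_f would silently return 0.0
amount = Float(expense["amount"])

# To store a true integer, one option is to keep the amount in cents.
# Rational parses the decimal string exactly, so no float rounding is involved.
amount_cents = (Rational(expense["amount"]) * 100).to_i

puts amount
puts amount_cents
```

Plain to_i on "5.75" would truncate to 5, which is why the sketch goes through Float or Rational first.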

converting nokogiri xml node into ruby hash

I have XML like this:
<parentNode>
<amount>12.0</amount><authIdCode>999999</authIdCode><currency>USD</currency>
</parentNode>
How can I get all the nodes inside parentNode into a hash, something like the one below?
{amount: "12", authIdCode: "999999", currency: "USD"}
Yes, I could search for individual keys using Nokogiri. But is it possible to get all the keys and values inside parentNode dynamically and turn them into a hash?
Thank you.
Note: Hash.from_xml won't work, as I'm not using Rails.
Using Hash[]:
Hash[doc.search('parentNode/*').map{|n| [n.name, n.text]}]
#=> {"amount"=>"12.0", "authIdCode"=>"999999", "currency"=>"USD"}
Here is a working sample:
require 'nokogiri'
xml = <<-EOS
<parentNode>
<amount>12.0</amount>
<authIdCode>999999</authIdCode>
<currency>USD</currency>
</parentNode>
EOS
document = Nokogiri::XML(xml)
hash = document.xpath("//parentNode/*").each_with_object({}) do |node, result|
result[node.name] = node.text
end
p hash # => {"amount"=>"12.0", "authIdCode"=>"999999", "currency"=>"USD"}
It finds all the children of parentNode and uses each child's name as the key and its text content as the value.

converting from xml name-values into simple hash

I don't know what name this goes by and that's been complicating my search.
My data file OX.session.xml is in the (old?) form
<?xml version="1.0" encoding="utf-8"?>
<CAppLogin xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://oxbranch.optionsxpress.com">
<SessionID>FE5E27A056944FBFBEF047F2B99E0BF6</SessionID>
<AccountNum>8228-5500</AccountNum>
<AccountID>967454</AccountID>
</CAppLogin>
What is that XML data format called exactly?
Anyway, all I want is to end up with one hash in my Ruby code like so:
CAppLogin = { :SessionID => "FE5E27A056944FBFBEF047F2B99E0BF6", :AccountNum => "8228-5500", etc. } # Doesn't have to be called CAppLogin as in the file, may be fixed
What might be the shortest, most built-in Ruby way to automate reading that hash, in a way that lets me update the SessionID value and easily store it back into the file for later program runs?
I've played around with YAML, REXML but would rather not yet print my (bad) example trials.
There are a few libraries you can use in Ruby to do this.
Ruby toolbox has some good coverage of a few of them:
https://www.ruby-toolbox.com/categories/xml_mapping
I use XmlSimple: just require the gem, then load your XML file using xml_in:
require 'xmlsimple'
hash = XmlSimple.xml_in('session.xml')
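If you're in a Rails environment, you can just use Active Support. Note that Hash.from_xml expects the XML content, not a file name:
require 'active_support'
require 'active_support/core_ext/hash'
session = Hash.from_xml(File.read('session.xml'))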
Using Nokogiri to parse the XML with namespaces:
require 'nokogiri'
dom = Nokogiri::XML(File.read('OX.session.xml'))
node = dom.xpath('ox:CAppLogin',
'ox' => "http://oxbranch.optionsxpress.com").first
hash = node.element_children.each_with_object(Hash.new) do |e, h|
h[e.name.to_sym] = e.content
end
puts hash.inspect
# {:SessionID=>"FE5E27A056944FBFBEF047F2B99E0BF6",
# :AccountNum=>"8228-5500", :AccountID=>"967454"}
If you know that the CAppLogin is the root element, you can simplify a bit:
require 'nokogiri'
dom = Nokogiri::XML(File.read('OX.session.xml'))
hash = dom.root.element_children.each_with_object(Hash.new) do |e, h|
h[e.name.to_sym] = e.content
end
puts hash.inspect
# {:SessionID=>"FE5E27A056944FBFBEF047F2B99E0BF6",
# :AccountNum=>"8228-5500", :AccountID=>"967454"}
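Since the question also asks for a built-in way to update SessionID and store it back, here is a rough sketch using REXML, which ships with Ruby (as a bundled gem in recent versions). The XML is inlined so the example is self-contained, and the replacement session value is made up for illustration.

```ruby
require 'rexml/document'

# Inline copy of the file from the question; in practice this would be
# File.read('OX.session.xml')
xml = <<~XML
  <?xml version="1.0" encoding="utf-8"?>
  <CAppLogin xmlns="http://oxbranch.optionsxpress.com">
    <SessionID>FE5E27A056944FBFBEF047F2B99E0BF6</SessionID>
    <AccountNum>8228-5500</AccountNum>
    <AccountID>967454</AccountID>
  </CAppLogin>
XML

doc = REXML::Document.new(xml)

# Read: one hash entry per child element of the root
login = {}
doc.root.each_element { |e| login[e.name.to_sym] = e.text }

# Update: change the text of the SessionID element (made-up value)
new_session = "0123456789ABCDEF0123456789ABCDEF"
doc.root.each_element { |e| e.text = new_session if e.name == "SessionID" }

# Serialize; to persist it: File.write('OX.session.xml', updated_xml)
updated_xml = doc.to_s

p login
puts updated_xml
```

Matching on the element name while iterating sidesteps namespace handling in XPath, at the cost of a linear scan over the children.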

Optimizing Ruby RSS

I'm writing a very simple Ruby script to parse tweets out of a twitter RSS feed. Here's the code I have:
require 'rss'
@rss = RSS::Parser.parse('statuses.xml', false)
outputfile = open("output.txt", "w")
@rss.items.each do |i|
pubdate = i.published.to_s
if pubdate.include? '2011-05'
tweet = i.title.to_s
tweet = tweet.gsub(/<title>SlyFlourish: /, "")
tweet = tweet.gsub(/<\/title>/, "\n\n")
outputfile << tweet
end
end
I think I'm missing something about dealing with the objects coming out of the RSS parser. Can someone tell me how I can better pull out the title and date entries from the object returned by the parser?
Is there a reason you chose RSS? Parsing XML is expensive.
I'd consider using JSON instead.
There's also a twitter Ruby gem that makes this really easy:
require "twitter"
Twitter.user_timeline("gavin_morrice").each do |tweet|
puts tweet.text
puts tweet.created_at
end
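To answer the original question with the rss library itself: the question's use of published suggests the feed is parsed as Atom, where title and published are element objects, and calling to_s on them yields the XML markup. Their content method returns the value instead. A minimal sketch, assuming an Atom feed and using a made-up sample in place of statuses.xml:

```ruby
require 'rss'

# Made-up Atom sample standing in for statuses.xml
atom = <<~XML
  <?xml version="1.0" encoding="utf-8"?>
  <feed xmlns="http://www.w3.org/2005/Atom">
    <title>Example timeline</title>
    <id>urn:example:feed</id>
    <updated>2011-05-15T12:00:00Z</updated>
    <entry>
      <title>SlyFlourish: first tweet</title>
      <id>urn:example:1</id>
      <published>2011-05-15T12:00:00Z</published>
      <updated>2011-05-15T12:00:00Z</updated>
    </entry>
    <entry>
      <title>SlyFlourish: older tweet</title>
      <id>urn:example:2</id>
      <published>2011-04-15T12:00:00Z</published>
      <updated>2011-04-15T12:00:00Z</updated>
    </entry>
  </feed>
XML

feed = RSS::Parser.parse(atom, false) # false: skip validation, as in the question

tweets = feed.items
             .select { |i| i.published.content.to_s.start_with?('2011-05') }
             .map { |i| i.title.content.sub(/\ASlyFlourish: /, '') }

puts tweets
```

With content there is no need to strip the title tags with gsub; only the user-name prefix has to be removed.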

Resources