Nokogiri: controlling element prefix for new child elements - ruby

I have an xml document like this:
<?xml version="1.0" encoding="UTF-8"?>
<foo:root xmlns:foo="http://abc.com#" xmlns:bar="http://def.com" xmlns:ex="http://ex.com">
<foo:element foo:attribute="attribute_value">
<bar:otherElement foo:otherAttribute="otherAttributeValue"/>
</foo:element>
</foo:root>
I need to add child elements to the element so that it looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<foo:root xmlns:foo="http://abc.com#" xmlns:bar="http://def.com" xmlns:ex="http://ex.com">
<foo:element foo:attribute="attribute_value">
<bar:otherElement foo:otherAttribute="otherAttributeValue"/>
<bar:otherElement foo:otherAttribute="newAttributeValue"/>
<ex:yetAnotherElement foo:otherAttribute="yetANewAttributeValue"/>
</foo:element>
</foo:root>
I can add elements in the correct location using the following:
require 'rubygems'
require 'nokogiri'
doc = Nokogiri::XML::Document.parse(File.open("myfile.xml"))
el = doc.at_xpath('//foo:element')
newEl = Nokogiri::XML::Node.new("otherElement", doc)
newEl["foo:otherAttribute"] = "newAttributeValue"
el.add_child(newEl)
newEl = Nokogiri::XML::Node.new("yetAnotherElement", doc)
newEl["foo:otherAttribute"] = "yetANewAttributeValue"
el.add_child(newEl)
However the prefix of the new elements is always "foo":
<foo:root xmlns:foo="http://abc.com#" xmlns:bar="http://def.com" xmlns:ex="http://ex.com">
<foo:element foo:attribute="attribute_value">
<bar:otherElement foo:otherAttribute="otherAttributeValue" />
<foo:otherElement foo:otherAttribute="newAttributeValue" />
<foo:yetAnotherElement foo:otherAttribute="yetANewAttributeValue" />
</foo:element>
</foo:root>
How can I set the prefix on the element name for these new child elements? Thanks,
Eoghan

(removed bit about defining namespace, orthogonal to question and fixed in edit)
Just add a few lines to your code and you get the desired result:
require 'rubygems'
require 'nokogiri'
doc = Nokogiri::XML::Document.parse(File.open("myfile.xml"))
el = doc.at_xpath('//foo:element')
newEl = Nokogiri::XML::Node.new("otherElement", doc)
newEl["foo:otherAttribute"] = "newAttributeValue"
# ADDITIONAL CODE
newEl.namespace = doc.root.namespace_definitions.find{|ns| ns.prefix=="bar"}
#
el.add_child(newEl)
newEl = Nokogiri::XML::Node.new("yetAnotherElement", doc)
newEl["foo:otherAttribute"] = "yetANewAttributeValue"
# ADDITIONAL CODE
newEl.namespace = doc.root.namespace_definitions.find{|ns| ns.prefix == "ex"}
#
el.add_child(newEl)
and the result:
<?xml version="1.0" encoding="UTF-8"?>
<foo:root xmlns:foo="http://abc.com#" xmlns:bar="http://def.com" xmlns:ex="http://ex.com">
<foo:element foo:attribute="attribute_value">
<bar:otherElement foo:otherAttribute="otherAttributeValue"/>
<bar:otherElement foo:otherAttribute="newAttributeValue"/>
<ex:yetAnotherElement foo:otherAttribute="yetANewAttributeValue"/>
</foo:element>
</foo:root>

The namespace 'foo' is not defined.
See this for more details:
Nokogiri/Xpath namespace query

Related

How to replace the DTD path with Nokogiri?

I'm opening an XML file with this content:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE relatos PUBLIC "-//SINCODH/DTD relatos 0.97" "relatos.dtd">
<relatos>
</relatos>
Now I want to replace the DOCTYPE declaration with one that points to a new DTD:
<!DOCTYPE relatos SYSTEM "test/dummy/public/midtd.dtd">
I'm trying the following, but it seems I first need to remove the existing DOCTYPE declaration:
docnoko = Nokogiri::XML(doc)
docnoko.create_internal_subset("relatos", nil, "test/dummy/public/midtd.dtd")
Well, usually Nokogiri makes it really easy to replace nodes or delete them and add something else in, but this requires a bit of a work-around:
require 'nokogiri'
old_doc = Nokogiri::XML(<<EOT)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE relatos PUBLIC "-//SINCODH/DTD relatos 0.97" "relatos.dtd">
<relatos>
<foo />
<bar />
</relatos>
EOT
Create a new document:
new_doc = Nokogiri::XML('<relatos/>')
Which looks like this:
new_doc.to_xml # => "<?xml version=\"1.0\"?>\n<relatos/>\n"
Then add the new DTD:
new_doc.create_internal_subset('relatos', nil, 'test/dummy/public/midtd.dtd')
Then append the nodes from the old document to the new one:
new_doc.at('relatos').children = old_doc.at('relatos').children
Resulting in:
new_doc.to_xml # => "<?xml version=\"1.0\"?>\n<!DOCTYPE relatos SYSTEM \"test/dummy/public/midtd.dtd\">\n<relatos>\n <foo/>\n <bar/>\n</relatos>\n"
Here's the code in one chunk:
require 'nokogiri'
old_doc = Nokogiri::XML(<<EOT)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE relatos PUBLIC "-//SINCODH/DTD relatos 0.97" "relatos.dtd">
<relatos>
<foo />
<bar />
</relatos>
EOT
new_doc = Nokogiri::XML('<relatos/>')
new_doc.create_internal_subset('relatos', nil, 'test/dummy/public/midtd.dtd')
new_doc.at('relatos').children = old_doc.at('relatos').children
You might ask on the Nokogiri-Talk list or their IRC channel as the really smart people hang out there.

Using nokogiri xpath to access nested elements within an xmlns

I'm new to nokogiri and am having trouble using xpath to access nested elements of an xml document with a specific xmlns.
Given the following code
#!/opt/chef/embedded/bin/ruby
require 'nokogiri'
doc = Nokogiri::XML.parse <<-XML
<?xml version="1.0" encoding="UTF-8" ?>
<domain xmlns="urn:jboss:domain:1.8">
<profiles>
<profile name="full">
<subsystem xmlns="urn:jboss:domain:datasources:1.2">
<datasources>
<datasource jndi-name="java:/Paulstestjndi" pool-name="pauls_ds" enabled="false">
<connection-url>jdbc:oracle:thin:#testhost1:80001paulstestinstance|jdbc:oracle:thin:#testhost2:80001paulstestinstance</connection-url>
</datasource>
</datasources>
</subsystem>
</profile>
</profiles>
</domain>
XML
datasources = doc.xpath('//datasources:datasource', 'datasources' => "urn:jboss:domain:datasources:1.2")
datasources.each do |datasource|
conn_url = datasource.xpath("connection-url")
puts "CLASS = #{conn_url.class}"
puts "No of Entries = #{conn_url.length}"
end
I am able to retrieve datasources using xpath but am unable to use xpath to access 'connection-url' for each datasource.
I have tried several xpath calls to achieve this; the following are examples:
conn_url = datasource.xpath("connection-url")
conn_url = datasource.xpath("//connection-url")
conn_url = datasource.xpath("//datasources:datasource/connection-url", 'datasources'=>"urn:jboss:domain:datasources:1.2")
But each seems to return an empty set of results.
What am I missing?
It’s a namespacing issue:
datasource.xpath(
'subsystem:connection-url',
'subsystem' => 'urn:jboss:domain:datasources:1.2')
#⇒ [#<... name="connection-url" namespace=...

Ruby XML Parsing

I have a sample XML document like
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<SearchSecretsResponse xmlns="urn:thesecretserver.com">
<SearchSecretsResult>
<Errors/>
<SecretSummaries>
<SecretSummary>
<SecretId>86</SecretId>
<SecretName>hostName\root</SecretName>
<SecretTypeName>Unix Root Account (SSH)</SecretTypeName>
</SecretSummary>
</SecretSummaries>
</SearchSecretsResult>
</SearchSecretsResponse>
</soap:Body>
</soap:Envelope>
I am trying to parse it using Nokogiri. My code is
doc = Nokogiri::XML.parse(xml)
puts doc.xpath('//SecretSummary')
But this doesn't print anything. What am I doing wrong?
You'll need to alias the namespace.
Nokogiri::XML(xml).xpath('//foo:SecretSummary', 'foo' => 'urn:thesecretserver.com')
You could also remove the namespaces entirely:
doc.remove_namespaces!

Parsing an XML file with Nokogiri?

<DataSet xmlns="http://www.atcomp.cz/webservices">
<xs:schema xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" id="file_mame">...</xs:schema>
<diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
<alldata xmlns="">
<category diffgr:id="category1" msdata:rowOrder="0">
<category_code>P...</category_code>
<category_name>...</category_name>
<subcategory diffgr:id="subcategory1" msdata:rowOrder="0">
<category_code>...</category_code>
<subcategory_code>...</subcategory_code>
<subcategory_name>...</subcategory_name>
</subcategory>
....
How can I obtain all categories and subcategories data?
I am trying something like:
reader.xpath('//DataSet/diffgr:diffgram/alldata').each do |node|
But this gives me:
undefined method `xpath' for #<Nokogiri::XML::Reader:0x000001021d1750>
Nokogiri's Reader parser does not support XPath. Try using Nokogiri's in-memory Document parser instead.
On another note, to query xpath namespaces, you need to provide a namespace mapping, like this:
doc = Nokogiri::XML(my_document_string_or_io)
namespaces = {
'default' => 'http://www.atcomp.cz/webservices',
'diffgr' => 'urn:schemas-microsoft-com:xml-diffgram-v1'
}
doc.xpath('//default:DataSet/diffgr:diffgram/alldata', namespaces).each do |node|
# ...
end
Or you can remove the namespaces:
doc.remove_namespaces!
doc.xpath('//DataSet/diffgram/alldata').each { |node| }

Add comment nodes outside root with ruby-libxml

I am writing an XML exporter in Ruby and I am using the libxml package for it.
I want to write some comment nodes outside the root element:
<?xml version="1.0" encoding="UTF-8"?>
<!-- comment -->
<root>
<childnode />
</root>
How do I accomplish export to above format?
Sample Ruby code to generate the above (without accounting for the comment node):
doc = XML::Document.new()
rootNode = XML::Node.new('root')
doc.root = rootNode
childNode = XML::Node.new('childnode')
rootNode << childNode
I ended up editing the XML string manually to add the comments outside the root node (for both libxml and Nokogiri).
<?xml version="1.0" encoding="UTF-8" ?>
<List type = "" =”00:75:00” =”00:00:05”>
</List>
Yes
<?xml version="1.0" encoding="UTF-8" ?>
<List type = "update" >
</List>
