<DataSet xmlns="http://www.atcomp.cz/webservices">
<xs:schema xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" id="file_mame">...</xs:schema>
<diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
<alldata xmlns="">
<category diffgr:id="category1" msdata:rowOrder="0">
<category_code>P...</category_code>
<category_name>...</category_name>
<subcategory diffgr:id="subcategory1" msdata:rowOrder="0">
<category_code>...</category_code>
<subcategory_code>...</subcategory_code>
<subcategory_name>...</subcategory_name>
</subcategory>
....
How can I obtain the data for all categories and subcategories?
I am trying something like:
reader.xpath('//DataSet/diffgr:diffgram/alldata').each do |node|
But this gives me:
undefined method `xpath' for #<Nokogiri::XML::Reader:0x000001021d1750>
Nokogiri's Reader parser does not support XPath. Try using Nokogiri's in-memory Document parser instead.
On another note, to query namespaced elements with XPath, you need to provide a namespace mapping, like this:
doc = Nokogiri::XML(my_document_string_or_io)
namespaces = {
'default' => 'http://www.atcomp.cz/webservices',
'diffgr' => 'urn:schemas-microsoft-com:xml-diffgram-v1'
}
doc.xpath('//default:DataSet/diffgr:diffgram/alldata', namespaces).each do |node|
# ...
end
Or you can remove the namespaces:
doc.remove_namespaces!
doc.xpath('//DataSet/diffgram/alldata').each { |node| }
I'm opening an XML file with this content:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE relatos PUBLIC "-//SINCODH/DTD relatos 0.97" "relatos.dtd">
<relatos>
</relatos>
Now I want to replace the DOCTYPE declaration with one that points to a new DTD:
<!DOCTYPE relatos SYSTEM "test/dummy/public/midtd.dtd">
I'm trying the following, but it seems I first need to remove the existing DOCTYPE:
docnoko = Nokogiri::XML(doc)
docnoko.create_internal_subset("relatos", nil, "test/dummy/public/midtd.dtd")
Well, usually Nokogiri makes it really easy to replace nodes or delete them and add something else in, but this requires a bit of a work-around:
require 'nokogiri'
old_doc = Nokogiri::XML(<<EOT)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE relatos PUBLIC "-//SINCODH/DTD relatos 0.97" "relatos.dtd">
<relatos>
<foo />
<bar />
</relatos>
EOT
Create a new document:
new_doc = Nokogiri::XML('<relatos/>')
Which looks like this:
new_doc.to_xml # => "<?xml version=\"1.0\"?>\n<relatos/>\n"
Then add the new DTD:
new_doc.create_internal_subset('relatos', nil, 'test/dummy/public/midtd.dtd')
Then append the nodes from the old document to the new one:
new_doc.at('relatos').children = old_doc.at('relatos').children
Resulting in:
new_doc.to_xml # => "<?xml version=\"1.0\"?>\n<!DOCTYPE relatos SYSTEM \"test/dummy/public/midtd.dtd\">\n<relatos>\n <foo/>\n <bar/>\n</relatos>\n"
Here's the code in one chunk:
require 'nokogiri'
old_doc = Nokogiri::XML(<<EOT)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE relatos PUBLIC "-//SINCODH/DTD relatos 0.97" "relatos.dtd">
<relatos>
<foo />
<bar />
</relatos>
EOT
new_doc = Nokogiri::XML('<relatos/>')
new_doc.create_internal_subset('relatos', nil, 'test/dummy/public/midtd.dtd')
new_doc.at('relatos').children = old_doc.at('relatos').children
You might ask on the Nokogiri-Talk list or their IRC channel as the really smart people hang out there.
I'm new to Nokogiri and am having trouble using XPath to access nested elements of an XML document that declares a specific xmlns.
Given the following code
#!/opt/chef/embedded/bin/ruby
require 'nokogiri'
doc = Nokogiri::XML.parse <<-XML
<?xml version="1.0" encoding="UTF-8" ?>
<domain xmlns="urn:jboss:domain:1.8">
<profiles>
<profile name="full">
<subsystem xmlns="urn:jboss:domain:datasources:1.2">
<datasources>
<datasource jndi-name="java:/Paulstestjndi" pool-name="pauls_ds" enabled="false">
<connection-url>jdbc:oracle:thin:@testhost1:80001paulstestinstance|jdbc:oracle:thin:@testhost2:80001paulstestinstance</connection-url>
</datasource>
</datasources>
</subsystem>
</profile>
</profiles>
</domain>
XML
datasources = doc.xpath('//datasources:datasource', 'datasources' => "urn:jboss:domain:datasources:1.2")
datasources.each do |datasource|
conn_url = datasource.xpath("connection-url")
puts "CLASS = #{conn_url.class}"
puts "No of Entries = #{conn_url.length}"
end
I am able to retrieve the datasources using XPath, but I am unable to use XPath to access 'connection-url' for each datasource.
I have tried several XPath calls to achieve this; the following are examples:
conn_url = datasource.xpath("connection-url")
conn_url = datasource.xpath("//connection-url")
conn_url = datasource.xpath("//datasources:datasource/connection-url", 'datasources'=>"urn:jboss:domain:datasources:1.2")
But each seems to return an empty set of results.
What am I missing?
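It’s a namespacing issue. The connection-url element inherits the default namespace declared on subsystem, so the relative XPath also needs a prefix mapped to that URI: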
datasource.xpath(
'subsystem:connection-url',
'subsystem' => 'urn:jboss:domain:datasources:1.2')
#⇒ [#<... name="connection-url" namespace=...
I have XML in the following format:
<Body xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/" xmlns="http://schemas.xmlsoap.org/soap/envelope/">
<TransactionAcknowledgement xmlns="">
<TransactionId>HELLO </TransactionId>
<UserId>MC</UserId>
<SendingPartyType>SE</SendingPartyType>
</TransactionAcknowledgement>
</Body>
I want to use an XQuery or XPath expression on it.
Specifically, I want to remove only the
xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/"
namespace declaration from the XML.
Is there any way to achieve this?
Thanks
Try to use functx:change-element-ns-deep:
let $xml := <Body xmlns:soap-env="http://schemas.xmlsoap.org/soap/envelope/" xmlns="http://schemas.xmlsoap.org/soap/envelope/">
<TransactionAcknowledgement xmlns="">
<TransactionId>HELLO </TransactionId>
<UserId>MC</UserId>
<SendingPartyType>SE</SendingPartyType>
</TransactionAcknowledgement>
</Body>
return functx:change-element-ns-deep($xml, "http://schemas.xmlsoap.org/soap/envelope/", "")
But, as Dimitre Novatchev said, this function doesn't change the namespace of the source XML; it creates a new XML document.
I have an xml document like this:
<?xml version="1.0" encoding="UTF-8"?>
<foo:root xmlns:foo="http://abc.com#" xmlns:bar="http://def.com" xmlns:ex="http://ex.com">
<foo:element foo:attribute="attribute_value">
<bar:otherElement foo:otherAttribute="otherAttributeValue"/>
</foo:element>
</foo:root>
I need to add child elements to the element so that it looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<foo:root xmlns:foo="http://abc.com#" xmlns:bar="http://def.com" xmlns:ex="http://ex.com">
<foo:element foo:attribute="attribute_value">
<bar:otherElement foo:otherAttribute="otherAttributeValue"/>
<bar:otherElement foo:otherAttribute="newAttributeValue"/>
<ex:yetAnotherElement foo:otherAttribute="yetANewAttributeValue"/>
</foo:element>
</foo:root>
I can add elements in the correct location using the following:
require 'rubygems'
require 'nokogiri'
doc = Nokogiri::XML::Document.parse(File.open("myfile.xml"))
el = doc.at_xpath('//foo:element')
newEl = Nokogiri::XML::Node.new("otherElement", doc)
newEl["foo:otherAttribute"] = "newAttributeValue"
el.add_child(newEl)
newEl = Nokogiri::XML::Node.new("yetAnotherElement", doc)
newEl["foo:otherAttribute"] = "yetANewAttributeValue"
el.add_child(newEl)
However the prefix of the new elements is always "foo":
<foo:root xmlns:foo="http://abc.com#" xmlns:bar="http://def.com" xmlns:ex="http://ex.com">
<foo:element foo:attribute="attribute_value">
<bar:otherElement foo:otherAttribute="otherAttributeValue" />
<foo:otherElement foo:otherAttribute="newAttributeValue" />
<foo:yetAnotherElement foo:otherAttribute="yetANewAttributeValue" />
</foo:element>
</foo:root>
How can I set the prefix on the element name for these new child elements? Thanks,
Eoghan
(removed bit about defining namespace, orthogonal to question and fixed in edit)
Just add a few lines to your code and you get the desired result:
require 'rubygems'
require 'nokogiri'
doc = Nokogiri::XML::Document.parse(File.open("myfile.xml"))
el = doc.at_xpath('//foo:element')
newEl = Nokogiri::XML::Node.new("otherElement", doc)
newEl["foo:otherAttribute"] = "newAttributeValue"
# ADDITIONAL CODE
newEl.namespace = doc.root.namespace_definitions.find{|ns| ns.prefix == "bar"}
#
el.add_child(newEl)
newEl = Nokogiri::XML::Node.new("yetAnotherElement", doc)
newEl["foo:otherAttribute"] = "yetANewAttributeValue"
# ADDITIONAL CODE
newEl.namespace = doc.root.namespace_definitions.find{|ns| ns.prefix == "ex"}
#
el.add_child(newEl)
and the result:
<?xml version="1.0" encoding="UTF-8"?>
<foo:root xmlns:abc="http://abc.com#" xmlns:def="http://def.com" xmlns:ex="http://ex.com" xmlns:foo="http://foo.com" xmlns:bar="http://bar.com">
<foo:element foo:attribute="attribute_value">
<bar:otherElement foo:otherAttribute="otherAttributeValue"/>
<bar:otherElement foo:otherAttribute="newAttributeValue"/>
<ex:yetAnotherElement foo:otherAttribute="yetANewAttributeValue"/>
</foo:element>
</foo:root>
The namespace 'foo' is not defined.
See this for more details:
Nokogiri/Xpath namespace query
YQL Console Link
Query:
select * from html where url='http://www.cbs.com/shows/big_brother/video/' and xpath='//div[@id="cbs-video-metadata-wrapper"]/div[@class="cbs-video-share"]/a'
Returns:
<?xml version="1.0" encoding="UTF-8"?>
<query xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
yahoo:count="1" yahoo:created="2011-07-09T23:14:02Z" yahoo:lang="en-US">
<diagnostics>
<publiclyCallable>true</publiclyCallable>
<url execution-time="146" proxy="DEFAULT"><![CDATA[http://www.cbs.com/shows/big_brother/video/]]></url>
<user-time>163</user-time>
<service-time>146</service-time>
<build-version>19262</build-version>
</diagnostics>
<results>
<a class="twitter-share-button" href="http://twitter.com/share"/>
</results>
</query>
Should Return Something Similar To:
<results>
</results>
If I back out the query one level, it totally strips out the element, which I could also use to get the data I need.
We now have a new HTML parser that recognizes custom attributes.
Add compat="html5" to trigger the new parser.
e.g.:
select * from html where url = "http://mydomain.com" and compat="html5"