How to replace the DTD path with Nokogiri? - ruby

I'm opening an XML file with this content:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE relatos PUBLIC "-//SINCODH/DTD relatos 0.97" "relatos.dtd">
<relatos>
</relatos>
Now I want to replace the DOCTYPE declaration with one that points to a new DTD:
<!DOCTYPE relatos SYSTEM "test/dummy/public/midtd.dtd">
I'm trying this, but it seems I first need to remove the existing DOCTYPE declaration:
docnoko = Nokogiri::XML(doc)
docnoko.create_internal_subset("relatos", nil, "test/dummy/public/midtd.dtd")

Well, usually Nokogiri makes it really easy to replace nodes or delete them and add something else in, but this requires a bit of a work-around:
require 'nokogiri'
old_doc = Nokogiri::XML(<<EOT)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE relatos PUBLIC "-//SINCODH/DTD relatos 0.97" "relatos.dtd">
<relatos>
<foo />
<bar />
</relatos>
EOT
Create a new document:
new_doc = Nokogiri::XML('<relatos/>')
Which looks like this:
new_doc.to_xml # => "<?xml version=\"1.0\"?>\n<relatos/>\n"
Then add the new DTD:
new_doc.create_internal_subset('relatos', nil, 'test/dummy/public/midtd.dtd')
Then append the nodes from the old document to the new one:
new_doc.at('relatos').children = old_doc.at('relatos').children
Resulting in:
new_doc.to_xml # => "<?xml version=\"1.0\"?>\n<!DOCTYPE relatos SYSTEM \"test/dummy/public/midtd.dtd\">\n<relatos>\n <foo/>\n <bar/>\n</relatos>\n"
Here's the code in one chunk:
require 'nokogiri'
old_doc = Nokogiri::XML(<<EOT)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE relatos PUBLIC "-//SINCODH/DTD relatos 0.97" "relatos.dtd">
<relatos>
<foo />
<bar />
</relatos>
EOT
new_doc = Nokogiri::XML('<relatos/>')
new_doc.create_internal_subset('relatos', nil, 'test/dummy/public/midtd.dtd')
new_doc.at('relatos').children = old_doc.at('relatos').children
You might ask on the Nokogiri-Talk list or their IRC channel as the really smart people hang out there.

Related

How to edit vsixmanifest using groovy without losing content?

My question may sound stupid, but I am struggling to get the result I want. I would like to edit and save the Version tag of a vsixmanifest file without losing any content. This article almost solved my problem, but it removes some of the tags.
<?xml version="1.0" encoding="utf-8"?>
<Vsix xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.0.0" xmlns="http://schemas.microsoft.com/developer/vsx-schema/2010">
<Identifier Id="Visual Test">
<Name>Visual Gallery Test</Name>
<Author>Visual Studio Demo</Author>
<Version>XXXXX</Version>
<Description xml:space="preserve">Visual Studio Gallery Demo</Description>
<Locale>1033</Locale>
<AllUsers>true</AllUsers>
<InstalledByMsi>false</InstalledByMsi>
<Icon>Resources/Icon.png</Icon>
<PreviewImage>Resources/Preview.png</PreviewImage>
<SupportedProducts>
<IsolatedShell Version="7.0">Visual Studio</IsolatedShell>
</SupportedProducts>
<SupportedFrameworkRuntimeEdition MinVersion="4.6" MaxVersion="4.9" />
</Identifier>
<Content>
<VsPackage>XX.pkgdef</VsPackage>
</Content>
</Vsix>
Here is my Gradle script:
task updateExtensionManifest{
def vsixmanifestFile = "build/source.extension.vsixmanifest"
def vsixmanifest = new XmlParser().parse(vsixmanifestFile)
vsixmanifest.Identifier[0].Version[0].value="YYYYY"
def nodePrinter = new XmlNodePrinter(new PrintWriter(new FileWriter(vsixmanifestFile)))
nodePrinter.preserveWhitespace = true
nodePrinter.expandEmptyElements = true
nodePrinter.print(vsixmanifest)
}
When I execute the script, it removes some of the tags defined in the manifest file. This is how the file looks after the task has run:
<Vsix xmlns="http://schemas.microsoft.com/developer/vsx-schema/2010" Version="1.0.0">
<Identifier Id="Visual Test">
<Name>Visual Gallery Test</Name>
<Author>Visual Studio Demo</Author>
<Version>YYYYY</Version>
<Description xml:space="preserve" xmlns:xml="http://www.w3.org/XML/1998/namespace">Visual Studio Gallery Demo</Description>
<Locale>1033</Locale>
<AllUsers>true</AllUsers>
<InstalledByMsi>false</InstalledByMsi>
<Icon>Resources/Icon.png</Icon>
<PreviewImage>Resources/Preview.png</PreviewImage>
<SupportedProducts>
<IsolatedShell Version="7.0">Visual Studio</IsolatedShell>
</SupportedProducts>
<SupportedFrameworkRuntimeEdition MinVersion="4.6" MaxVersion="4.9"></SupportedFrameworkRuntimeEdition>
</Identifier>
<Content>
<VsPackage>XX.pkgdef</VsPackage>
</Content>
</Vsix>
Some unwanted edits:
Line 1 was removed: <?xml version="1.0" encoding="utf-8"?>
Line 2 was modified to: <Vsix xmlns="http://schemas.microsoft.com/developer/vsx-schema/2010" Version="1.0.0">
The "Description" tag was edited too.
How can I avoid these edits? I only expected the version to change from XXXXX to YYYYY, without my Gradle build script touching any other content.
That is because of XmlNodePrinter, which does not write the XML declaration.
If the XML declaration is needed, use XmlUtil.serialize():
def vsixmanifest = new XmlSlurper().parse(vsixmanifestFile)
vsixmanifest.Identifier[0].Version.replaceBody ( "YYYYY" )
println groovy.xml.XmlUtil.serialize(vsixmanifest)

Ruby XML Parsing

I have a sample XML document like this:
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<SearchSecretsResponse xmlns="urn:thesecretserver.com">
<SearchSecretsResult>
<Errors/>
<SecretSummaries>
<SecretSummary>
<SecretId>86</SecretId>
<SecretName>hostName\root</SecretName>
<SecretTypeName>Unix Root Account (SSH)</SecretTypeName>
</SecretSummary>
</SecretSummaries>
</SearchSecretsResult>
</SearchSecretsResponse>
</soap:Body>
</soap:Envelope>
I am trying to parse it using Nokogiri. My code is:
doc = Nokogiri::XML.parse(xml)
puts doc.xpath('//SecretSummary')
But this doesn't print anything. What am I doing wrong?
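You'll need to alias the namespace. SecretSummary sits inside the default namespace urn:thesecretserver.com declared on SearchSecretsResponse, and an unprefixed name in XPath only matches elements that have no namespace, so bind that URI to a prefix of your choice:
Nokogiri::XML(xml).xpath('//foo:SecretSummary', 'foo' => 'urn:thesecretserver.com')
Alternatively, you can strip all namespaces from the document: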
doc.remove_namespaces!

Parsing an XML file with Nokogiri?

<DataSet xmlns="http://www.atcomp.cz/webservices">
<xs:schema xmlns="" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" id="file_mame">...</xs:schema>
<diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
<alldata xmlns="">
<category diffgr:id="category1" msdata:rowOrder="0">
<category_code>P...</category_code>
<category_name>...</category_name>
<subcategory diffgr:id="subcategory1" msdata:rowOrder="0">
<category_code>...</category_code>
<subcategory_code>...</subcategory_code>
<subcategory_name>...</subcategory_name>
</subcategory>
....
How can I obtain all categories and subcategories data?
I am trying something like:
reader.xpath('//DataSet/diffgr:diffgram/alldata').each do |node|
But this gives me:
undefined method `xpath' for #<Nokogiri::XML::Reader:0x000001021d1750>
Nokogiri's Reader parser does not support XPath. Try using Nokogiri's in-memory Document parser instead.
On another note, to query namespaced elements with XPath, you need to provide a namespace mapping, like this:
doc = Nokogiri::XML(my_document_string_or_io)
namespaces = {
'default' => 'http://www.atcomp.cz/webservices',
'diffgr' => 'urn:schemas-microsoft-com:xml-diffgram-v1'
}
doc.xpath('//default:DataSet/diffgr:diffgram/alldata', namespaces).each do |node|
# ...
end
Or you can remove the namespaces:
doc.remove_namespaces!
doc.xpath('//DataSet/diffgram/alldata').each { |node| }

Nokogiri: controlling element prefix for new child elements

I have an xml document like this:
<?xml version="1.0" encoding="UTF-8"?>
<foo:root xmlns:foo="http://abc.com#" xmlns:bar="http://def.com" xmlns:ex="http://ex.com">
<foo:element foo:attribute="attribute_value">
<bar:otherElement foo:otherAttribute="otherAttributeValue"/>
</foo:element>
</foo:root>
I need to add child elements to the element so that it looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<foo:root xmlns:foo="http://abc.com#" xmlns:bar="http://def.com" xmlns:ex="http://ex.com">
<foo:element foo:attribute="attribute_value">
<bar:otherElement foo:otherAttribute="otherAttributeValue"/>
<bar:otherElement foo:otherAttribute="newAttributeValue"/>
<ex:yetAnotherElement foo:otherAttribute="yetANewAttributeValue"/>
</foo:element>
</foo:root>
I can add elements in the correct location using the following:
require 'rubygems'
require 'nokogiri'
doc = Nokogiri::XML::Document.parse(File.open("myfile.xml"))
el = doc.at_xpath('//foo:element')
newEl = Nokogiri::XML::Node.new("otherElement", doc)
newEl["foo:otherAttribute"] = "newAttributeValue"
el.add_child(newEl)
newEl = Nokogiri::XML::Node.new("yetAnotherElement", doc)
newEl["foo:otherAttribute"] = "yetANewAttributeValue"
el.add_child(newEl)
However the prefix of the new elements is always "foo":
<foo:root xmlns:foo="http://abc.com#" xmlns:bar="http://def.com" xmlns:ex="http://ex.com">
<foo:element foo:attribute="attribute_value">
<bar:otherElement foo:otherAttribute="otherAttributeValue" />
<foo:otherElement foo:otherAttribute="newAttributeValue" />
<foo:yetAnotherElement foo:otherAttribute="yetANewAttributeValue" />
</foo:element>
</foo:root>
How can I set the prefix on the element name for these new child elements? Thanks,
Eoghan
(removed bit about defining namespace, orthogonal to question and fixed in edit)
Just add a few lines to your code and you get the desired result:
require 'rubygems'
require 'nokogiri'
doc = Nokogiri::XML::Document.parse(File.open("myfile.xml"))
el = doc.at_xpath('//foo:element')
newEl = Nokogiri::XML::Node.new("otherElement", doc)
newEl["foo:otherAttribute"] = "newAttributeValue"
# ADDITIONAL CODE
newEl.namespace = doc.root.namespace_definitions.find { |ns| ns.prefix == "bar" }
#
el.add_child(newEl)
newEl = Nokogiri::XML::Node.new("yetAnotherElement", doc)
newEl["foo:otherAttribute"] = "yetANewAttributeValue"
# ADDITIONAL CODE
newEl.namespace = doc.root.namespace_definitions.find{|ns| ns.prefix == "ex"}
#
el.add_child(newEl)
and the result:
<?xml version="1.0" encoding="UTF-8"?>
<foo:root xmlns:abc="http://abc.com#" xmlns:def="http://def.com" xmlns:ex="http://ex.com" xmlns:foo="http://foo.com" xmlns:bar="http://bar.com">
<foo:element foo:attribute="attribute_value">
<bar:otherElement foo:otherAttribute="otherAttributeValue"/>
<bar:otherElement foo:otherAttribute="newAttributeValue"/>
<ex:yetAnotherElement foo:otherAttribute="yetANewAttributeValue"/>
</foo:element>
</foo:root>
The namespace 'foo' is not defined.
See this for more details:
Nokogiri/Xpath namespace query

Add comment nodes outside root with ruby-libxml

I am writing an XML exporter in Ruby and I am using the libxml package for it.
I want to write some comment nodes outside the root element:
<?xml version="1.0" encoding="UTF-8"?>
<!-- comment -->
<root>
<childnode />
</root>
How do I export to the above format?
Sample Ruby code to generate the above (without accounting for the comment node):
doc = XML::Document.new()
rootNode = XML::Node.new('root')
doc.root = rootNode
childNode = XML::Node.new('childnode')
rootNode << childNode
I ended up editing the XML string manually to add the comments outside the root node (for both libxml and Nokogiri).
