I have the following code for generating a xml from a rabl template:
obj = OpenStruct.new
obj.categories = [{node: ["Foo","Bar"]},{node: ["Test1","Test2"]}]
Rabl::Renderer.xml(obj, 'adapter_xml')
and this is the rabl template adapter_xml.rabl
object #obj => :root
attributes :categories
which generates this XML:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<categories>
<category>
<node>
<node>Foo</node>
<node>Bar</node>
</node>
</category>
<category>
<node>
<node>Test1</node>
<node>Test2</node>
</node>
</category>
</categories>
</root>
But what I want to achieve is the following format, without the extra <node> tags:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<categories>
<category>
<node>Foo</node>
<node>Bar</node>
</category>
<category>
<node>Test1</node>
<node>Test2</node>
</category>
</categories>
</root>
Is there any way to do this with rabl? Or do I have to modify the ruby code mentioned first?
Related
I have XML files that I need to validate against XSDs that are nested inside a <wsdl:types></wsdl:types> in a WSDL file retrieved from a web service.
Inside the <wsdl:types></wsdl:types> there are several <xs:schema>s. I am using the ruby gem nokogiri to load the XML files and validate them against said XSDs, however, I am getting following error when I run the program:
Element '{http://schemas.xmlsoap.org/soap/envelope/}Envelope': No
matching global declaration available for the validation root.
So far I have extracted out the <xs:schema>s (all 4 of them) and copied them into a schema.xsd file.
Code:
require 'rubygems'
require 'nokogiri'
def validate(document_path, schema_path)
schema = Nokogiri::XML::Schema(File.read(schema_path))
document = Nokogiri::XML(File.read(document_path))
schema.validate(document)
end
validate('data.xml', 'schema.xsd').each do |error|
puts error.message
end
So basically my schema.xsd has multiple <xs:schema>s in there, which I do not think in and of itself is a problem because nokogiri didn't throw errors when I instantiated the schema object.
schema.xsd
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:tns="http://schemas.microsoft.com/2003/10/Serialization/" attributeFormDefault="qualified" elementFormDefault="qualified" targetNamespace="http://schemas.microsoft.com/2003/10/Serialization/">
<xs:element name="anyType" nillable="true" type="xs:anyType"/>
<xs:element name="anyURI" nillable="true" type="xs:anyURI"/>
<!-- data in here -->
</xs:schema>
<!-- three more xs:schema tags removed for brevity -->
data.xml
<?xml version='1.0' ?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/">
<env:Header />
<env:Body>
<CreatePerson xmlns="https://person.example.com/">
<oMessageType xmlns:epa="http://schemas.datacontract.org/2004/07/whatever" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:array="http://schemas.microsoft.com/2003/10/Serialization/Arrays">
<person:bodyField>
<!-- data in here -->
</person:bodyField>
<!-- more data in here -->
</oMessageType>
</CreatePerson>
</env:Body>
</env:Envelope>
Yes, WSDL not the XSD scheam so you need to extract the schema partial manually or automatic by programming.
Here is the sample code, you need to refactor it and using in your codes.
str = <<EOF
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/" xmlns:tns="http://www.examples.com/wsdl/HelloService.wsdl" xmlns:xsd="http://www.w3.org/2001/XMLSchema" name="HelloService" targetNamespace="http://www.examples.com/wsdl/HelloService.wsdl">
<types>
<schema xmlns="http://www.w3.org/2001/XMLSchema" xmlns:tns="http://schemas.microsoft.com/2003/10/Serialization/" attributeFormDefault="qualified" elementFormDefault="qualified" targetNamespace="http://schemas.microsoft.com/2003/10/Serialization/">
<element name="anyType" nillable="true" type="anyType"/>
<element name="anyURI" nillable="true" type="anyURI"/>
<!-- data in here -->
</schema>
</types>
<message name="SayHelloRequest">
<part name="firstName" type="xsd:string"/>
</message>
<message name="SayHelloResponse">
<part name="greeting" type="xsd:string"/>
</message>
<portType name="Hello_PortType">
<operation name="sayHello">
<input message="tns:SayHelloRequest"/>
<output message="tns:SayHelloResponse"/>
</operation>
</portType>
<binding name="Hello_Binding" type="tns:Hello_PortType">
<soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
<operation name="sayHello">
<soap:operation soapAction="sayHello"/>
<input>
<soap:body use="encoded" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" namespace="urn:examples:helloservice"/>
</input>
<output>
<soap:body use="encoded" encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" namespace="urn:examples:helloservice"/>
</output>
</operation>
</binding>
<service name="Hello_Service">
<documentation>WSDL File for HelloService</documentation>
<port name="Hello_Port" binding="tns:Hello_Binding">
<soap:address location="http://www.examples.com/SayHello/"/>
</port>
</service>
</definitions>
EOF
require 'rubygems'
require 'nokogiri'
doc = Nokogiri::XML(str)
doc.root.children.each do |child|
if child.node_name == 'types'
types = child
# p types.inner_html
xsd_doc = Nokogiri::XML(types.inner_html)
# p xsd_doc.root
xsd = Nokogiri::XML::Schema.from_document xsd_doc.root
end
end
I receive a large XML file and often the XML file do not validate to schema file.
Instead of droping the whole xml file I would like to remove the "invalid" content and save the rest of the XML file.
I'm using xmllint to validate the xml by this command:
xmllint -schema testSchedule.xsd testXML.xml
The XSD file (in this example named testSchedule.xsd):
<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" targetNamespace="http://www.testing.dk" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="MasterData">
<xs:complexType>
<xs:sequence>
<xs:element name="Items">
<xs:complexType>
<xs:sequence>
<xs:element name="Item" maxOccurs="unbounded" minOccurs="0">
<xs:complexType>
<xs:sequence>
<xs:element type="xs:integer" name="Id" minOccurs="1"/>
<xs:element type="xs:integer" name="Width" minOccurs="1"/>
<xs:element type="xs:integer" name="Height" minOccurs="0"/>
<xs:element type="xs:string" name="Remark"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
And the XML file (In this example named testXML.xml):
<?xml version="1.0" encoding="ISO-8859-1" ?>
<MasterData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.testing.dk">
<Items>
<Item>
<Id>1</Id>
<Width>10</Width>
<Height>100</Height>
<Remark>This is OK</Remark>
</Item>
<Item>
<Id>2</Id>
<Width>20</Width>
<Height>200</Height>
<Remark>This is OK - But is missing Height a non mandatory field</Remark>
</Item>
<Item>
<Id>3</Id>
<Height>300</Height>
<Remark>This is NOT OK - Missing the mandatory Width</Remark>
</Item>
<Item>
<Id>4</Id>
<Width>TheIsAString</Width>
<Height>200</Height>
<Remark>This is NOT OK - Width is not an integer but a string</Remark>
</Item>
<Item>
<Id>5</Id>
<Width>50</Width>
<Height>500</Height>
<Remark>This is OK and the last</Remark>
</Item>
</Items>
</MasterData>
Then I get the this result of the xmllint command:
testXML.xml:18: element Height: Schemas validity error : Element '{http://www.testing.dk}Height': This element is not expected. Expected is ( {http://www.testing.dk}Width ).
testXML.xml:23: element Width: Schemas validity error : Element '{http://www.testing.dk}Width': 'TheIsAString' is not a valid value of the atomic type 'xs:integer'.
testXML.xml fails to validate
And that is all correct - There is two errors in the XML file.
Now I would like to have a tool of some kind to remove entry 3 and 4 so I end up with this result:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<MasterData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.testing.dk">
<Items>
<Item>
<Id>1</Id>
<Width>10</Width>
<Height>100</Height>
<Remark>This is OK</Remark>
</Item>
<Item>
<Id>2</Id>
<Width>20</Width>
<Height>200</Height>
<Remark>This is OK - But is missing Height a non mandatory field</Remark>
</Item>
<Item>
<Id>5</Id>
<Width>50</Width>
<Height>500</Height>
<Remark>This is OK and the last</Remark>
</Item>
</Items>
</MasterData>
Does anybody in here have a tool that can do this?
I'm currently using bash scripting and the xmllint.
I really hope somebody can help.
You can achieve that with this XSLT stylesheet, which you can run in any environment which supports XSLT 1.0 (most languages), using an command line tool such as xsltproc (libxslt) or Saxon, a browser or an online tool. Here is an example.
If you feed your original XML file as input to a XSLT transformer with the following stylesheet it will produce the result you have shown in your second XML:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
xmlns:t="http://www.testing.dk">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="#*|node()">
<xsl:copy>
<xsl:apply-templates select="#*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="t:Item[t:Id and not(number(t:Id))]"/>
<xsl:template match="t:Item[t:Width and not(number(t:Width))]"/>
<xsl:template match="t:Item[t:Height and not(number(t:Height))]"/>
<xsl:template match="t:Item[not(t:Width)]"/>
<xsl:template match="t:Item[not(t:Id)]"/>
<xsl:template match="t:Item[not(t:Remark)]"/>
</xsl:stylesheet>
The first <xsl:template> block simply copies all nodes from the source tree to the result tree. It has lower precedence than the specific templates which match nodes by their name.
Since matches are done in XPath which requires namespace-qualified selectors, your default namespace was declared in the <xsl:stylesheet> opening tag and mapped to a prefix that was used to qualify the tag names.
Each template uses an XPath expression to test if a specific child element is present or not in Item, or if that child is present, if it is a number (according to the XSD).
I'm using XSLT 1.0, which is more widely supported and should be easier to find in your environment. But if you can use a XSLT 2.0 processor, you could use XSLT 2.0 features such as support for XSD types, and instead of comparing your values to number type, you could compare them to specific types like xsd:integer.
You can verify the transformation performed on your example XML by that stylesheet in this XSLT Fiddle.
If you create a XML document containing the code above and place it in a file named stylesheet.xsl you can run the transformation using xsltproc (which probably exists in your environment) using:
xsltproc stylesheet.xsl testXML.xml > fixedXML.xml
<document>
<element>
<attribut a:name="my-name">My Name</attribut>
<attribut a:parent="parent1">Parent 1</attribut>
</element>
</document>
In this XML document, how to select the node which has the attribut a:name ?
$xmlTest = <<<XML
<?xml version="1.0" encoding="UTF-8" ?>
<document xmlns:a="http://example.org/a">
<element>
<attribut a:name="my-name">My Name</attribut>
<attribut a:parent="parent1">Parent 1</attribut>
</element>
</document>
XML;
$xml = new SimpleXMLElement($xmlTest);
echo current($xml->xpath('//element/attribut[#a:name]'));
If ancestor nodes defines namespaces, I can use them:
> Nokogiri::XML(<<-XML
<?xml version='1.0' encoding='UTF-8'?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
<dc:creator opf:role="aut">John Doe</dc:creator>
</metadata>
</package>
XML
> xml.at_xpath("//dc:creator[#opf:role='aut']", xml.at_xpath("//xmlns:metadata").namespaces).text
=> "John Doe"
However, what shall I do with following XML?
> Nokogiri::XML(<<-XML
<?xml version='1.0' encoding='UTF-8'?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
<metadata>
<dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf" opf:role="aut">John Doe</dc:creator>
</metadata>
</package>
XML
> xml.at_xpath("//dc:creator[#opf:role='aut']", xml.at_xpath("//xmlns:metadata").namespaces).text
Nokogiri::XML::XPath::SyntaxError: Undefined namespace prefix: //dc:creator[#opf:role='aut']
I think xml.remove_namespaces! or literal namespace arguments for at_xpath is last resort.
To programmatically collect all the namespaces, use Document#collect_namespaces.
xml = Nokogiri::XML(xmldata)
ns = xml.collect_namespaces
puts xml.at('//dc:creator[#opf:role="aut"]', ns).text
Output:
John Doe
I have this piece of nokogiri code, thats runs slower then I like on large input. How would you redo this in XSLT? Any other ideas to make it run faster?
# remove namespaces (other then soapenv) from input xml, and move
# them to type attribute.
# xml is Nokogiri::XML object
def cleanup_namespaces(xml)
protected_ns = %w( soapenv )
xml.traverse do |el|
next unless el.respond_to? :namespace
if (ns=el.namespace) &&
!protected_ns.include?(ns.prefix) then
el['type'] = "#{ns.prefix}:#{el.name}"
el.namespace = nil
end
end
xml
end
The sample input I am testing with is:
<?xml version="1.0"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<ns1:getAccountDTOResponse xmlns:ns1="http://www.example.com/pw/services/PWServices"
soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<getAccountDTOReturn xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:ns2="urn:PWServices"
soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xsi:type="ns2:Account">
<ns1:ID soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xsi:type="xsd:long">0</ns1:ID>
<ns1:accountNumber xsi:type="soapenc:string" />
<ns1:accountType xsi:type="soapenc:string" />
<ns1:clientData xsi:type="soapenc:Array" xsi:nil="true" />
<ns1:name xsi:type="soapenc:string" />
<ns1:parentRef xsi:type="soapenc:string" />
</getAccountDTOReturn>
</ns1:getAccountDTOResponse>
</soapenv:Body>
</soapenv:Envelope>
The expected output is:
<?xml version="1.0"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<getAccountDTOResponse xmlns:ns1="http://www.example.com/pw/services/PWServices"
soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
type="ns1:getAccountDTOResponse">
<getAccountDTOReturn xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:ns2="urn:PWServices"
soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xsi:type="ns2:Account">
<ID soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xsi:type="xsd:long" type="ns1:ID">0</ID>
<accountNumber xsi:type="soapenc:string"
type="ns1:accountNumber" />
<accountType xsi:type="soapenc:string"
type="ns1:accountType" />
<clientData xsi:type="soapenc:Array" xsi:nil="true"
type="ns1:clientData" />
<name xsi:type="soapenc:string" type="ns1:name" />
<parentRef xsi:type="soapenc:string"
type="ns1:parentRef" />
</getAccountDTOReturn>
</getAccountDTOResponse>
</soapenv:Body>
</soapenv:Envelope>
This input is a SOAP response. A tangential question is, what is the
point of the ns1 type namespace in the SOAP response, and is it
reasonable to throw them away completely. I don't seem to need to
reference them when parsing the response.
This XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="#* | node()" name="identity">
<xsl:copy>
<xsl:apply-templates select="#* | node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="*[namespace-uri() =
'http://www.example.com/pw/services/PWServices']">
<xsl:element name="{local-name()}">
<xsl:attribute name="type">
<xsl:value-of select="name()"/>
</xsl:attribute>
<xsl:apply-templates select="#* | node()"/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
Against your sample will produce this result:
<?xml version="1.0" encoding="UTF-16"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
<getAccountDTOResponse type="ns1:getAccountDTOResponse" soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<getAccountDTOReturn soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xsi:type="ns2:Account" xmlns:ns1="http://www.example.com/pw/services/PWServices" xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" xmlns:ns2="urn:PWServices">
<ID type="ns1:ID" soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xsi:type="xsd:long">0</ID>
<accountNumber type="ns1:accountNumber" xsi:type="soapenc:string"></accountNumber>
<accountType type="ns1:accountType" xsi:type="soapenc:string"></accountType>
<clientData type="ns1:clientData" xsi:type="soapenc:Array" xsi:nil="true"></clientData>
<name type="ns1:name" xsi:type="soapenc:string"></name>
<parentRef type="ns1:parentRef" xsi:type="soapenc:string"></parentRef>
</getAccountDTOReturn>
</getAccountDTOResponse>
</soapenv:Body>
</soapenv:Envelope>