How to validate XML against XSD in PHP - validation

I tried to validate my XML against an XSD in the following way.
<?php
// $xmlPath and $xsdPath stand in for the actual file paths.
$xml = new DOMDocument;
$xml->load($xmlPath);
if ($xml->schemaValidate($xsdPath)) {
    print "valid.\n";
} else {
    print "invalid.\n";
}
?>
It gives me the following error:
DOMDocument::schemaValidate(): Element '{http://www.w3.org/2001/XMLSchema}attribute': The content is not valid. Expected is (annotation?).
To see the error details, I added libxml_use_internal_errors(true); to the validation code, as you can see below.
<?php
// $xmlPath and $xsdPath stand in for the actual file paths.
libxml_use_internal_errors(true);
$xml = new DOMDocument;
$xml->load($xmlPath);
if ($xml->schemaValidate($xsdPath)) {
    print "valid.\n";
} else {
    print "invalid.\n";
}
?>
After adding this, I got the following warning:
Severity: Warning
Message: DOMDocument::schemaValidate(): Invalid Schema
I validated this XML online against my XSD and it is valid. However, my PHP code reports an error. I googled these errors, and the results suggest the document could be invalid, but I am sure the document is correct because I validated it online. I am a bit new to this, so there may be a mistake in my validation code that I am unable to see.
Here is the XML:
<Application xmlns="NextGenMALI-Schema" Type="NewApplication">
<Identifier>1584928194</Identifier>
<SalesChannel SalesChannelType="Broker">
<Identifier>I12345</Identifier>
<CompanyName BusinessName="National Finance and Loans"/>
<PersonName>
<NameTitle Value="Mr"/>
<FirstName>Jonson</FirstName>
<Surname>Jonson</Surname>
</PersonName>
<RelatedEntityRef Type="crm_id">abc@yahoo.com</RelatedEntityRef>
<Email>abc@yahoo.com</Email>
</SalesChannel>
<SalesChannel SalesChannelDescription="BDM">
<Identifier/>
<PersonName>
<Surname/>
</PersonName>
</SalesChannel>
<Comment>Test comments.</Comment>
<PartySegment>
<Party Type="Applicant" PrimaryApplicant="Yes">
<Identifier>4</Identifier>
<Privacy AllowCreditCheck="Yes" AllowDirectMarketing="No" SignatureVerification="No" PointVerificationCompleted="No" ExistingCustomer="Yes" AllowThirdPartyDisclosure="Yes">
<PointVerification DocumentType="DriversLicenceAust" DocumentNumber="12345" NameOnDocument="Josie Ann Test" VersionSighted="Original" VerificationCategory="Primary" Photographic="Yes" NameVerified="Yes" AddressVerified="Yes" SignatureVerified="Yes" DOBVerified="Yes">
<PlaceOfIssue>
<City>NSW</City>
</PlaceOfIssue>
<EndDate>2014-11-28</EndDate>
</PointVerification>
<DocumentsSightedBy LoanWriter="No"/>
</Privacy>
<ResponsibleLend/>
<Person FirstHomeBuyer="Yes" CustomerOfLender="Yes" EmployeeOfLender="Yes" Director="No" PreviousName="Jones" Sex="Female">
<PersonName>
<NameTitle Value="Mr"/>
<FirstName>First Name</FirstName>
<OtherName>Middle Name</OtherName>
<Surname>Surname</Surname>
</PersonName>
<DateOfBirth>1993-02-01</DateOfBirth>
<MaritalStatus Status="Single" OtherDescription=""/>
<Dependent Age="2"/><Dependent Age="4"/>
<MothersMaidenName>Mother Maiden Name</MothersMaidenName>
<Residency PermanentInAustralia="Yes" Status="Citizen">
<Country ISO3166="AU"/>
</Residency>
<ContactDetails>
<AddressDetails Residential="Yes" Mailing="No" PriorAddress="No" PostSettlement="No" HousingStatus="Boarding">
<RelatedEntityRef>45e7815c217b8a</RelatedEntityRef>
<StartAndEndDates><StartDate>2019-04-04</StartDate></StartAndEndDates>
</AddressDetails><AddressDetails Residential="No" Mailing="No" PriorAddress="No" PostSettlement="No" HousingStatus="Caravan">
<RelatedEntityRef>45e7815c217ba4</RelatedEntityRef>
<StartAndEndDates><StartDate>2017-02-02</StartDate><EndDate>2017-02-02</EndDate></StartAndEndDates>
</AddressDetails><AddressDetails Residential="No" Mailing="No" PriorAddress="No" PostSettlement="No" HousingStatus="Renting">
<RelatedEntityRef>45e7815c217bac</RelatedEntityRef>
<StartAndEndDates><StartDate>2015-03-01</StartDate><EndDate>2015-03-01</EndDate></StartAndEndDates>
</AddressDetails>
<Email PreferredContactMethod="Yes">asd@yahoo.com</Email>
</ContactDetails>
<Employment OnProbation="No" PrimaryEmployment="Yes" PriorEmployment="No" Role="1112-11" RoleDescription="General Manager">
<PAYE Type="FullTime">
<RelatedEntityRef Type="RelatedParty">1097796810218</RelatedEntityRef>
</PAYE>
<StartAndEndDates>
<StartDate>2005-01-01</StartDate>
</StartAndEndDates>
<EmploymentIncome>
<ValueItem Value="10000">
<Identifier>IDARAHXC-Income</Identifier>
<Income Type="OtherIncome">
<Period Unit="Monthly">
<StartDate>2013-07-01</StartDate>
<EndDate>2014-06-30</EndDate>
</Period>
</Income>
</ValueItem>
</EmploymentIncome>
</Employment>
</Person>
</Party>
</PartySegment>
<AddressSegment>
<AddressWrapper>
<Identifier>45e7815c217b8a</Identifier>
<Address>
<StreetNo>8</StreetNo><Street Type="Street">Street</Street><City>Melbourne</City><State Name="VIC"/><Postcode>3000</Postcode>
<Country ISO3166="AU"/>
</Address>
</AddressWrapper>
<AddressWrapper>
<Identifier>45e7815c217ba4</Identifier>
<Address>
<StreetNo>4</StreetNo><Street Type="Street">Street</Street><City>Melville</City><State Name="WA"/><Postcode>6153</Postcode>
<Country ISO3166="AU"/>
</Address>
</AddressWrapper>
<AddressWrapper>
<Identifier>45e7815c217bac</Identifier>
<Address>
<StreetNo>7</StreetNo><Street Type="Street">Street</Street><City>Shire of Mornington Peninsula</City><State Name="VIC"/><Postcode>3941</Postcode>
<Country ISO3166="AU"/>
</Address>
</AddressWrapper>
</AddressSegment>
<FinancialSegment NoOtherAssets="No" NoLiabilities="No" NoIncome="No">
<ValueItem Value="41000">
<Identifier>1097796822609</Identifier>
<OwnedByAllApplicants/>
<Asset Class="CurrentSecurity" SecurityType="RegisteredMortgage">
<RealEstate Transaction="Purchasing" PrimarySecurity="Yes" PropertyPrimaryPurpose="OwnerOccupied" ApprovalInPrinciple="No" Status="Established" Holding="Sole">
<Residential WillOwn3UnitsInComplex="No" WillOwn25PercentOfComplex="No" OffThePlan="No" Type="FullyDetachedHouse"/>
<EstimatedValue>41000</EstimatedValue>
<ContractPrice LicencedRealEstateAgentContract="No" ContractPriceAmount="41000"/>
<VisitContact Type="Customer"/>
<Location>
<RelatedEntityRef>1415066564285</RelatedEntityRef>
<Title TitleType="Torrens" TenureType="Freehold"/>
</Location>
</RealEstate>
</Asset>
</ValueItem>
</FinancialSegment>
<RelatedPartySegment>
<RelatedParty RelPartyType="Lender">
<Identifier>1097796817156</Identifier>
<CompanyName BusinessName="AMP"/>
</RelatedParty>
</RelatedPartySegment>
<LoanDetailSegment>
<LoanDetails>
<Identifier>1097796802265</Identifier>
<EstimatedSettlement>2014-11-28</EstimatedSettlement>
<LoanPurpose PrimaryPurpose="OwnerOccupied" OwnerBuilderApplication="No">
<LendingPurposeCode ABSCode="ABS-125" PurposeAmount="0.0" Description="Affinity Fixed Rate, 1 Year (Investment)"/>
</LoanPurpose>
<LoanPortion ProductName="Affinity Fixed Rate, 1 Year (Investment)" ProductCode="AFFR1I" StatementCycle="Monthly" PackageName="">
<Identifier>1097796802265-1</Identifier>
<LoanTerm Units="Years" Type="TotalTerm">30</LoanTerm>
<LoanTerm Units="Years" Type="Variable" PaymentType="PrincipalAndInterest"/>
<AmountRequested Amount="41315" BaseAmount="41315"/>
<PaymentPeriod Payments="0.0">
<Period Unit="Monthly"/>
</PaymentPeriod>
</LoanPortion>
</LoanDetails>
<Representation Self="No">
<NominatedRepresentation>
<RelatedEntityRef>1415066458059</RelatedEntityRef>
</NominatedRepresentation>
</Representation>
</LoanDetailSegment>
<CrossSellSegment Enabled="true"/>
</Application>
<!-- NGPH: 237119.297 -->
Please find the XSD here (it exceeds the post length limit, so I had to upload it to share it with you):
https://easyupload.io/qifnk1
password: myxsd
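A note on reading the messages above: "Invalid Schema" and the error on the '{http://www.w3.org/2001/XMLSchema}attribute' element refer to the XSD itself, not to the XML document. The XSD is not reproduced here, so the cause cannot be confirmed. One pattern the XSD rules forbid, and which may lead to an "Expected is (annotation?)" message, is an xs:attribute that has a type or ref attribute and also contains a child such as xs:simpleType. The sketch below (Python standard library; the function name and the sample schema are made up for illustration) lists such declarations so they can be checked by hand.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

def suspicious_attributes(xsd_text):
    """List xs:attribute declarations that have `ref` or `type`
    and also contain a child other than xs:annotation."""
    root = ET.fromstring(xsd_text)
    hits = []
    for attr in root.iter(XS + "attribute"):
        if "ref" not in attr.attrib and "type" not in attr.attrib:
            continue  # a nested simpleType is allowed in this case
        extra = [child for child in attr if child.tag != XS + "annotation"]
        if extra:
            hits.append(attr.get("name") or attr.get("ref"))
    return hits

# Hypothetical schema fragment, not taken from the uploaded XSD.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="T">
    <xs:attribute name="ok" type="xs:string"/>
    <xs:attribute name="bad" type="xs:string">
      <xs:simpleType><xs:restriction base="xs:string"/></xs:simpleType>
    </xs:attribute>
  </xs:complexType>
</xs:schema>"""

print(suspicious_attributes(SAMPLE_XSD))
```

If the list comes back empty for the real XSD, the problem lies elsewhere in the schema. Some online validators are more lenient about schema errors than libxml2, which could explain why the same files pass there.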

Related

Ruby nokogiri attribute selector in XML file

this is the xml file:
<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/">
<SOAP-ENV:Body>
<ns1:putResponse
xmlns:ns1="urn:DmsManagerClient">
<result xsi:type="xsd:string">
<?xml version="1.0" encoding="ISO-8859-1"?>
<MESSAGE ID="11c73b9e-687c-4300-baba-b743c26f7c83" TYPE="CUSDMS">
<DELIVERY>
<FROM>
<SENDER>0072000</SENDER>
<SERVICE>eService</SERVICE>
<DATE>2019-03-08T12:27:25</DATE>
</FROM>
<TO>
<DEALER DEALERCODE="0072000" MARKETCODE="1000"/>
</TO>
</DELIVERY>
<CONTENT>
<dms:ComplexResponse ErrorCode="430" ErrorDescription="null : PrivacyUE Mancante" Return="false"
xmlns:dms="http://dmsmanagerservice">
<dms:Element Name="DMSVERSION">2.7</dms:Element>
</dms:ComplexResponse>
</CONTENT>
</MESSAGE>
</result>
</ns1:putResponse>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
I am coding in Ruby, and I used Nokogiri and its xpath method to extract the "CONTENT" of the file.
This is the code:
def extrapolate_error(xml)
doc = Nokogiri::XML(File.open(xml))
doc.xpath('//CONTENT')
end
and this is the result:
[#<Nokogiri::XML::Element:0x1c5ba78 name="CONTENT" children=[
#<Nokogiri::XML::Text:0x1c5b940 "\n">,
#<Nokogiri::XML::Element:0x1c5b8bc name="ComplexResponse" namespace=#<Nokogiri::XML::Namespace:0x1c5b88c prefix="dms" href="http://dmsmanagerservice">
attributes=[
#<Nokogiri::XML::Attr:0x1c5b874 name="ErrorCode" value="430">,
#<Nokogiri::XML::Attr:0x1c5b868 name="ErrorDescription" value="null : PrivacyUE Mancante">,
#<Nokogiri::XML::Attr:0x1c5b85c name="Return" value="false">]
children=[#<Nokogiri::XML::Text:0x1c5b118 "\n">,
#<Nokogiri::XML::Element:0x1c5b094 name="Element" namespace=#<Nokogiri::XML::Namespace:0x1c5b88c prefix="dms" href="http://dmsmanagerservice">
attributes=[#<Nokogiri::XML::Attr:0x1c5b058 name="Name" value="DMSVERSION">]
children=[#<Nokogiri::XML::Text:0x1c5abe4 "2.7">]>,
#<Nokogiri::XML::Text:0x1c5aaac "\n">]>,
#<Nokogiri::XML::Text:0x1c5a974 "\n">]>]
Now I need to go into it and select some attributes.
Specifically, I need these:
name="ErrorCode" value="430"
name="ErrorDescription" value="null : PrivacyUE Mancante"
I do not know how to proceed. Can you help me?
The following should work for you, assuming the dms namespace is always the same:
doc.xpath('//CONTENT/dms:ComplexResponse', dms: 'http://dmsmanagerservice')
  .xpath('@ErrorCode | @ErrorDescription')
  .each_with_object({}) do |e,obj|
    obj[e.name] = e.text
  end
#=> {"ErrorCode"=>"430", "ErrorDescription"=>"null : PrivacyUE Mancante"}
You already understand how you got to //CONTENT, so from there we use dms:ComplexResponse to navigate deeper. Since this element is namespaced, we have to provide the namespace reference, e.g. dms: 'http://dmsmanagerservice'.
Then we select the attributes we are interested in: @ErrorCode and @ErrorDescription.
In XPath the pipe | is the union operator, so the result contains the nodes matched by either expression, which here means both attributes.
Then we just build a Hash using the attribute name as the key and its text as the value.
XPath Cheatsheet - Useful resource if you need additional reference
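For comparison, here is a minimal sketch of the same lookup in Python's standard library rather than Nokogiri. It assumes the inner MESSAGE document has already been extracted from the SOAP envelope; the trimmed FRAGMENT and the function name error_fields are illustrative, not part of the original code.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the inner MESSAGE document from the question.
FRAGMENT = """<MESSAGE ID="11c73b9e-687c-4300-baba-b743c26f7c83" TYPE="CUSDMS">
  <CONTENT>
    <dms:ComplexResponse ErrorCode="430" ErrorDescription="null : PrivacyUE Mancante"
        Return="false" xmlns:dms="http://dmsmanagerservice">
      <dms:Element Name="DMSVERSION">2.7</dms:Element>
    </dms:ComplexResponse>
  </CONTENT>
</MESSAGE>"""

# The prefix used here only has to match the one used in the path below.
NS = {"dms": "http://dmsmanagerservice"}

def error_fields(xml_text):
    """Return ErrorCode and ErrorDescription of the ComplexResponse as a dict."""
    root = ET.fromstring(xml_text)
    response = root.find(".//CONTENT/dms:ComplexResponse", NS)
    if response is None:
        return {}
    wanted = ("ErrorCode", "ErrorDescription")
    return {name: response.attrib[name] for name in wanted if name in response.attrib}

print(error_fields(FRAGMENT))
```

As in the Ruby version, the namespace URI is what matters; the prefix is only a local alias.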
Update
You asked about conditionals, so this is what I would propose:
ndoc = Nokogiri::XML(doc)
namespaces = ndoc.collect_namespaces
response = ndoc.xpath("//CONTENT/dms:ComplexResponse", namespaces)
if response.xpath("self::node()[@ErrorCode != '' and @ErrorDescription != '']").any?
  response.xpath("@ErrorCode | @ErrorDescription")
    .each_with_object({}) do |e,obj|
      obj[e.name] = e.text
    end
else
  response.xpath('dms:Element/@Name | dms:Element/text()',namespaces)
    .each_slice(2)
    .map {|s| s.map(&:text)}.to_h
end
This checks whether there is a non-empty ErrorCode and a non-empty ErrorDescription. If so, it builds the Hash as originally proposed. If not, it returns all the dms:Elements as a Hash, so {"DMSVERSION"=>"2.7"} in this case.
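The same branching can be sketched in Python's standard library. This is an illustration under the assumption that the ComplexResponse element has already been located; the function name summarize and the two sample elements are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"dms": "http://dmsmanagerservice"}

def summarize(response):
    """Return the error attributes when both are non-empty,
    otherwise the Name/text pairs of the dms:Element children."""
    code = response.get("ErrorCode", "")
    description = response.get("ErrorDescription", "")
    if code and description:
        return {"ErrorCode": code, "ErrorDescription": description}
    return {el.get("Name"): el.text for el in response.findall("dms:Element", NS)}

with_error = ET.fromstring(
    '<dms:ComplexResponse xmlns:dms="http://dmsmanagerservice" ErrorCode="430" '
    'ErrorDescription="null : PrivacyUE Mancante" Return="false">'
    '<dms:Element Name="DMSVERSION">2.7</dms:Element></dms:ComplexResponse>'
)
# Made-up success response without error attributes.
without_error = ET.fromstring(
    '<dms:ComplexResponse xmlns:dms="http://dmsmanagerservice" Return="true">'
    '<dms:Element Name="DMSVERSION">2.7</dms:Element></dms:ComplexResponse>'
)

print(summarize(with_error))
print(summarize(without_error))
```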

KML exported from Maps fails validation against its own schema

I imported an Earth model into My Maps, then exported it to a KMZ. I unzipped the KMZ and ran the resulting doc.kml through a validator (XmlValidator) against the XSD at http://schemas.opengis.net/kml/2.2.0/ogckml22.xsd.
The response?
C:\Users\Bugmagnet\Downloads\XmlValidate-master>bin\xv.bat -v -kmz ..\doc.kml
Check: ..\doc.kml
http://www.opengis.net/kml/2.2
ERROR: SAXParseException org.xml.sax.SAXParseException; lineNumber: 3319; columnNumber: 38; cvc-complex-type.2.4.a:
Invalid content was found starting with element 'Style'. One of '{"http://www.opengis.net/kml/2.2":AbstractFeatureGroup, "http://www.opengis.net/kml/2.2":DocumentSimpleExtensionGroup, "http://www.opengis.net/kml/2.2":DocumentObjectExtensionGroup}' is expected.
Line: 3319, column: 38
3319: <Style id="line-000000-1-normal">***
ERROR: SAXParseException org.xml.sax.SAXParseException; lineNumber: 3319; columnNumber: 38; cvc-complex-type.2.4.a:
Invalid content was found starting with element 'Style'. One of '{"http://www.opengis.net/kml/2.2":AbstractFeatureGroup, "http://www.opengis.net/kml/2.2":DocumentSimpleExtensionGroup, "http://www.opengis.net/kml/2.2":DocumentObjectExtensionGroup}' is expected.
Line: 3319, column: 38
Errors: 2 Warnings: 0 Files: 1 Time: 3659 ms
Valid files 0/1 (0%)
Is this important? Will it bite me later? What, if anything, can or should be done?
I'm asking because I'm using the KML exported from Maps as a template for generating KML programmatically for use in Maps.
Strangely, the first instance of the markup <Style id="line-000000-1-normal"> is not on line 3319 but on 4022 being
<Style id='line-000000-1-normal'>
<LineStyle>
<color>ff000000</color>
<width>1</width>
</LineStyle>
</Style>
Line 3319 is part way through a Placemark, viz
<Placemark>
<name>LPVGDatumLutID {133}- Swan Hill</name>
<description>
<![CDATA[Log Provider [10] Google Analytics V3]]>
</description>
<styleUrl>#poly-000000-1-76</styleUrl>
<ExtendedData>
</ExtendedData>
<Polygon>
<outerBoundaryIs>
<LinearRing>
<tessellate>1</tessellate>
<coordinates>143.5614,-35.250688,0.0 143.555643,-35.250811,0.0 143.549902,-35.25118,0.0 143.544192,-35.251793,0.0 143.538529,-35.25264899999999,0.0 143.532928,-35.253746,0.0 143.527406,-35.25508000000001,0.0 143.521976,-35.256649,0.0 143.516654,-35.258447,0.0 143.511454,-35.26046899999999,0.0 143.506391,-35.26271100000001,0.0 143.501478,-35.265166,0.0 143.496729,-35.267828,0.0 143.492156,-35.270688,0.0 143.487773,-35.27373999999999,0.0 143.483592,-35.276975,0.0 143.479623,-35.280383,0.0 143.475877,-35.283957,0.0 143.472366,-35.287686,0.0 143.469098,-35.29155900000001,0.0 143.466083,-35.295567000000005,0.0 143.463328,-35.299698,0.0 143.460842,-35.30394100000001,0.0 143.458631,-35.308284,0.0 143.456702,-35.312715,0.0 143.455059,-35.317223,0.0 143.453707,-35.32179500000001,0.0 143.45265,-35.326419,0.0 143.451892,-35.331081,0.0 143.451433,-35.335769,0.0 143.451276,-35.34047000000001,0.0 143.45142,-35.345172,0.0 143.451866,-35.349861,0.0 143.452613,-35.354524,0.0 143.453657,-35.35914900000001,0.0 143.454998,-35.363723,0.0 143.45663,-35.368234,0.0 143.458549,-35.372668999999995,0.0 143.460751,-35.377016000000005,0.0 143.46322900000004,-35.381263000000004,0.0 143.465977,-35.385398,0.0 143.468986,-35.389411,0.0 143.47224900000003,-35.39328900000001,0.0 143.475757,-35.397023,0.0 143.479501,-35.400601,0.0 143.483469,-35.404015,0.0 143.487652,-35.407255000000006,0.0 143.492037,-35.410312,0.0 143.496612,-35.413178,0.0 143.501366,-35.415844,0.0 143.506285,-35.418304,0.0 143.511355,-35.42055,0.0 143.516563,-35.422577,0.0 143.521894,-35.424378999999995,0.0 143.527334,-35.42595,0.0 143.532867,-35.427287,0.0 143.538479,-35.428386,0.0 143.544154,-35.429244,0.0 143.549876,-35.429859,0.0 143.55563,-35.430228,0.0 143.5614,-35.430352,0.0 143.56717,-35.430228,0.0 143.572924,-35.429859,0.0 143.578646,-35.429244,0.0 143.584321,-35.428386,0.0 143.589933,-35.427287,0.0 143.595466,-35.42595,0.0 143.600906,-35.424379,0.0 143.606237,-35.422577,0.0 143.611445,-35.42055,0.0 
143.616515,-35.418304000000006,0.0 143.621434,-35.41584400000001,0.0 143.626188,-35.413178,0.0 143.630763,-35.410312,0.0 143.635148,-35.407255000000006,0.0 143.639331,-35.404015,0.0 143.643299,-35.400601,0.0 143.647043,-35.397023,0.0 143.650551,-35.393289,0.0 143.653814,-35.389411,0.0 143.656823,-35.385398,0.0 143.659571,-35.381263000000004,0.0 143.662049,-35.377016000000005,0.0 143.664251,-35.372669,0.0 143.66617,-35.368234,0.0 143.667802,-35.363723,0.0 143.669143,-35.35914900000001,0.0 143.670187,-35.354524,0.0 143.670934,-35.349861000000004,0.0 143.67138,-35.345172,0.0 143.671524,-35.34047000000001,0.0 143.671367,-35.335769000000006,0.0 143.670908,-35.331081,0.0 143.67015,-35.32641900000001,0.0 143.669093,-35.321795,0.0 143.667741,-35.317223,0.0 143.666098,-35.312715,0.0 143.664169,-35.308284,0.0 143.661958,-35.30394100000001,0.0 143.659472,-35.29969799999999,0.0 143.656717,-35.295567000000005,0.0 143.653702,-35.291559,0.0 143.650434,-35.287686,0.0 143.646923,-35.283957,0.0 143.643177,-35.280383,0.0 143.639208,-35.276975,0.0 143.635027,-35.273740000000004,0.0 143.630644,-35.270688,0.0 143.626071,-35.267828,0.0 143.621322,-35.265166,0.0 143.616409,-35.262710999999996,0.0 143.611346,-35.260469,0.0 143.606146,-35.258447,0.0 143.600824,-35.256649,0.0 143.595394,-35.25508000000001,0.0 143.589872,-35.253746,0.0 143.584271,-35.25264899999999,0.0 143.57860800000003,-35.251793,0.0 143.572898,-35.25118,0.0 143.567157,-35.250811,0.0 143.5614,-35.250688,0.0</coordinates>
</LinearRing>
</outerBoundaryIs>
</Polygon>
</Placemark>
LATER
More weirdness: I imported the kml into Fusion Charts and then exported it. The KML now has the style information before the placemarks, and validates perfectly.
Solution: Put the style information first.
<?xml version='1.0' encoding='UTF-8'?>
<kml xmlns='http://www.opengis.net/kml/2.2'>
<Document>
<name>doc</name>
<Style id='Style2-point-1'>
<IconStyle>
<color>ff0000ff</color>
<scale>1.0</scale>
<Icon>
<href>http://maps.google.com/mapfiles/kml/shapes/placemark_circle.png</href>
</Icon>
</IconStyle>
<LabelStyle>
<scale>0.0</scale>
</LabelStyle>
<BalloonStyle>
<text>$[description]</text>
</BalloonStyle>
</Style>
<Style id='Style2-point-1-hover'>
<IconStyle>
<color>ff0000ff</color>
<scale>1.0</scale>
<Icon>
<href>http://maps.google.com/mapfiles/kml/shapes/placemark_circle.png</href>
</Icon>
</IconStyle>
<BalloonStyle>
<text>$[description]</text>
</BalloonStyle>
</Style>
<StyleMap id='Style2-point-1-map'>
<Pair>
<key>normal</key>
<styleUrl>#Style2-point-1</styleUrl>
</Pair>
<Pair>
<key>highlight</key>
<styleUrl>#Style2-point-1-hover</styleUrl>
</Pair>
</StyleMap>
<Placemark>
<name>LPVGDatumLutID {11}- Canberra</name>
<snippet/>
<description>
<![CDATA[<div class="googft-info-window">
<b>description:</b> Log Provider [10] Google Analytics V3<br>
<b>name:</b> LPVGDatumLutID {11}- Canberra
</div>]]></description>
<styleUrl>#Style2-polygon-3-map</styleUrl>
<ExtendedData/>
<Polygon>
<outerBoundaryIs>
<LinearRing>
<coordinates>148.836481,-35.026301,0.0 148.833751,-35.028007,0.0 148.816238,-35.040183,0.0 148.799522,-35.053091,0.0 148.783649,-35.066695,0.0 148.768662,-35.080959,0.0 148.754603,-35.095843,0.0 148.74151,-35.111308,0.0 148.729419,-35.127311,0.0 148.718365,-35.143808,0.0 148.708377,-35.160755,0.0 148.699483,-35.178105,0.0 148.691709,-35.195812,0.0 148.685076,-35.213826,0.0 148.679604,-35.2321,0.0 148.675308,-35.250581,0.0 148.672201,-35.269222,0.0 148.670291,-35.287969,0.0 148.669585,-35.306772,0.0 148.670086,-35.32558,0.0 148.671794,-35.34434,0.0 148.674704,-35.363002,0.0 148.678809,-35.381513,0.0 148.684099,-35.399824,0.0 148.69056,-35.417884,0.0 148.698174,-35.435644,0.0 148.706923,-35.453053,0.0 148.716782,-35.470065,0.0 148.727726,-35.486632,0.0 148.739723,-35.502709,0.0 148.752743,-35.518252,0.0 148.766749,-35.533217,0.0 148.781704,-35.547563,0.0 148.797566,-35.561251,0.0 148.814293,-35.574242,0.0 148.831838,-35.586501,0.0 148.850153,-35.597994,0.0 148.869187,-35.608688,0.0 148.888889,-35.618555,0.0 148.909204,-35.627566,0.0 148.930075,-35.635698,0.0 148.951446,-35.642927,0.0 148.973256,-35.649233,0.0 148.995445,-35.654599,0.0 149.017952,-35.659009,0.0 149.040715,-35.662453,0.0 149.063669,-35.664919,0.0 149.086753,-35.666401,0.0 149.104176,-35.666774,0.0 149.1099,-35.666896,0.0 149.133047,-35.666401,0.0 149.156131,-35.664919,0.0 149.179085,-35.662453,0.0 149.201848,-35.659009,0.0 149.224355,-35.654599,0.0 149.246544,-35.649233,0.0 149.268354,-35.642927,0.0 149.289725,-35.635698,0.0 149.310596,-35.627566,0.0 149.330911,-35.618555,0.0 149.350613,-35.608688,0.0 149.369647,-35.597994,0.0 149.387962,-35.586501,0.0 149.405507,-35.574242,0.0 149.422234,-35.561251,0.0 149.438096,-35.547563,0.0 149.453051,-35.533217,0.0 149.467057,-35.518252,0.0 149.480077,-35.502709,0.0 149.492074,-35.486632,0.0 149.503018,-35.470065,0.0 149.512877,-35.453053,0.0 149.521626,-35.435644,0.0 149.52924,-35.417884,0.0 149.535701,-35.399824,0.0 149.540991,-35.381513,0.0 
149.545096,-35.363002,0.0 149.548006,-35.34434,0.0 149.549714,-35.32558,0.0 149.550215,-35.306772,0.0 149.549509,-35.287969,0.0 149.547599,-35.269222,0.0 149.544492,-35.250581,0.0 149.540196,-35.2321,0.0 149.537488,-35.223056,0.0 149.534724,-35.213826,0.0 149.528091,-35.195812,0.0 149.520317,-35.178105,0.0 149.511423,-35.160755,0.0 149.501435,-35.143808,0.0 149.490381,-35.127311,0.0 149.47829,-35.111308,0.0 149.465197,-35.095843,0.0 149.451138,-35.080959,0.0 149.436151,-35.066695,0.0 149.420278,-35.053091,0.0 149.403562,-35.040183,0.0 149.386049,-35.028007,0.0 149.367787,-35.016595,0.0 149.348826,-35.005979,0.0 149.329217,-34.996186,0.0 149.309014,-34.987245,0.0 149.288271,-34.979178,0.0 149.267046,-34.972008,0.0 149.245395,-34.965755,0.0 149.223377,-34.960435,0.0 149.201052,-34.956062,0.0 149.178481,-34.952648,0.0 149.155724,-34.950204,0.0 149.132843,-34.948734,0.0 149.1099,-34.948244,0.0 149.104176,-34.948367,0.0 149.086957,-34.948734,0.0 149.064076,-34.950204,0.0 149.041319,-34.952648,0.0 149.018748,-34.956062,0.0 148.996423,-34.960435,0.0 148.974405,-34.965755,0.0 148.952754,-34.972008,0.0 148.931529,-34.979178,0.0 148.910786,-34.987245,0.0 148.890583,-34.996186,0.0 148.870974,-35.005979,0.0 148.852013,-35.016595,0.0 148.836481,-35.026301,0.0</coordinates>
</LinearRing>
</outerBoundaryIs>
</Polygon>
</Placemark>
</Document>
</kml>
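Since the goal is to generate KML programmatically, the reordering can also be done in code. The KML 2.2 schema defines the children of Document as a sequence in which Style and StyleMap come before the features, which is why a Style after a Placemark is rejected. Below is a minimal sketch using Python's standard library; the function names and the sample are illustrative. It assumes the Document has no Region, ExtendedData or Schema children, which the schema also places between the styles and the features, and it only handles Document elements, not Folders.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
STYLE_TAGS = {f"{{{KML_NS}}}{name}" for name in ("Style", "StyleMap")}
FEATURE_TAGS = {
    f"{{{KML_NS}}}{name}"
    for name in ("Placemark", "Folder", "Document", "NetworkLink",
                 "GroundOverlay", "PhotoOverlay", "ScreenOverlay")
}

def hoist_styles(document):
    """Move Style/StyleMap children found after the first feature to just before it."""
    children = list(document)
    first_feature = next(
        (i for i, child in enumerate(children) if child.tag in FEATURE_TAGS), None
    )
    if first_feature is None:
        return
    late_styles = [c for c in children[first_feature:] if c.tag in STYLE_TAGS]
    for style in late_styles:
        document.remove(style)
    # The first feature keeps its index, so inserting there preserves style order.
    for offset, style in enumerate(late_styles):
        document.insert(first_feature + offset, style)

def reorder(kml_text):
    """Parse KML text and hoist the styles in every Document element."""
    ET.register_namespace("", KML_NS)  # keep the default namespace when serializing
    root = ET.fromstring(kml_text)
    for document in list(root.iter(f"{{{KML_NS}}}Document")):
        hoist_styles(document)
    return root

SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <name>doc</name>
    <Placemark><name>A</name><styleUrl>#line-000000-1-normal</styleUrl></Placemark>
    <Style id="line-000000-1-normal"><LineStyle><width>1</width></LineStyle></Style>
  </Document>
</kml>"""

fixed = reorder(SAMPLE_KML)
print(ET.tostring(fixed, encoding="unicode"))
```

The output should still be run through the validator, since the sketch only fixes this one ordering rule.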

Separate XML content from a single XML file using XQuery

I have an XML file which contains multiple XML nodes. I would like to separate two of these nodes and store them in separate variables. How would I write this with XQuery? I have added my XML file below. The file has a division root element with two child elements, Dive and top-song. I want to read the Dive content into one variable and the top-song content into another. Can anyone please help me sort out this issue?
<?xml version="1.0" encoding="UTF-8"?>
<division>
<Dive ID="2"><!-- I want this node in one variable -->
<DiverFName>Joe</DiverFName>
<DiverLName>Diver</DiverLName>
<Number>2</Number>
<Divedate>1998-03-30</Divedate>
<Country ID="1">Bahamas</Country>
<City ID="2">Freeport</City>
<Place ID="2">
<Site>South Pass</Site>
<Lat>24.865062</Lat>
<Lon>-77.871094</Lon>
</Place>
<Divetime>36.00</Divetime>
<Depth Scale="METRIC">5.48</Depth>
<Buddy IDs="2" Names="Tim Diver" />
<Comments>Great dive, saw 5 Caribbean Reef Sharks. Performed compass navigation skills for Scuba Diver certification.</Comments>
<Water>Salt</Water>
<Entry>Boat</Entry>
<Divetype>Research</Divetype>
<Tanktype>Alu</Tanktype>
<Tanksize>11.43</Tanksize>
<PresS>179.26</PresS>
<PresE>82.73</PresE>
<Gas>Air</Gas>
<Weather>Clear</Weather>
<UWCurrent>Medium Current</UWCurrent>
<MarineLife>
<Animal>
<Type>Nurse Shark</Type>
<Abundance>1</Abundance>
<Size>3 ft</Size>
<Description>Dormant on the bottom, not swimming.</Description>
<Image>
<Filename></Filename>
<Path></Path>
<Caption></Caption>
</Image>
</Animal>
<Animal>
<Type>Blue Tang Surgeonfish</Type>
<Abundance>25+</Abundance>
<Size>4 in</Size>
<Description>Blue with white "scalpel" near base </Description>
<Image>
<Filename></Filename>
<Path></Path>
<Caption></Caption>
</Image>
</Animal>
</MarineLife>
</Dive>
<top-song><!-- I want this node in another variable -->
<title >Try Again</title>
<artist >Aaliyah</artist>
<weeks last="2008-06-17">
<week>2008-06-17</week>
</weeks>
<album> The
Album</album>
<released>February 29, 20008</released>
<formats>
<format>CD</format>
<format>12 single</format>
</formats>
<recorded>january2012</recorded>
<genres>
<genre>R&amp;B</genre>
</genres>
<lengths>
<length>4:04</length>
</lengths>
<label>Blackground</label>
<writers>
<writer></writer>
<writer></writer>
</writers>
<producers>
<producer></producer>
</producers>
<descr>
<p>hai hello</p>
</descr>
</top-song>
</division>
It's not clear what you're trying to accomplish at a high level, but you can select those elements with some simple XQuery/XPath:
let $dive := doc('mydoc.xml')/division/Dive
let $top-song := doc('mydoc.xml')/division/top-song
return ($dive, $top-song)
However, just looking at the document it's clear that these two elements are in totally unrelated schemas, and as a general recommendation for MarkLogic, they should probably each be separated before ingestion and inserted as separate documents.
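Outside of an XQuery processor, the same split can be sketched in a few lines of Python's standard library. The trimmed SAMPLE and the function name split_division are illustrative; the real file would be parsed with ET.parse instead.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the document in the question.
SAMPLE = """<division>
  <Dive ID="2"><DiverFName>Joe</DiverFName></Dive>
  <top-song><title>Try Again</title></top-song>
</division>"""

def split_division(xml_text):
    """Return the Dive and top-song children of the division root."""
    division = ET.fromstring(xml_text)
    return division.find("Dive"), division.find("top-song")

dive, top_song = split_division(SAMPLE)
print(ET.tostring(dive, encoding="unicode"))
print(ET.tostring(top_song, encoding="unicode"))
```

Each variable holds an independent element subtree, so either one can be serialized and stored as its own document.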

NSXMLDocument, nodesForXPath with namespaces

I want to get a set of elements from an XML file, but as soon as the elements involve namespaces, it fails.
This is a fragment of the xml file:
<gpx xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
version="1.0" creator="Groundspeak Pocket Query"
xsi:schemaLocation="http://www.topografix.com/GPX/1/0 http://www.topografix.com/GPX/1/0/gpx.xsd http://www.groundspeak.com/cache/1/0 http://www.groundspeak.com/cache/1/0/cache.xsd"
xmlns="http://www.topografix.com/GPX/1/0">
<name>My Finds Pocket Query</name>
<desc>Geocache file generated by Groundspeak</desc>
<author>Groundspeak</author>
<email>contact@groundspeak.com</email>
<time>2010-09-15T16:18:55.9846906Z</time>
<keywords>cache, geocache, groundspeak</keywords>
<bounds minlat="41.89687" minlon="5.561883" maxlat="70.669967" maxlon="25.74735" />
<wpt lat="62.244933" lon="25.74735">
<time>2010-01-11T08:00:00Z</time>
<name>GC22W1T</name>
<desc>Kadonneet ja karanneet by ooti, Traditional Cache (1.5/2)</desc>
<url>http://www.geocaching.com/seek/cache_details.aspx?guid=4af28fe9-401b-44df-b058-5fd5399fc083</url>
<urlname>Kadonneet ja karanneet</urlname>
<sym>Geocache Found</sym>
<type>Geocache|Traditional Cache</type>
<groundspeak:cache id="1521507" available="True" archived="False" xmlns:groundspeak="http://www.groundspeak.com/cache/1/0">
<groundspeak:name>Kadonneet ja karanneet</groundspeak:name>
<groundspeak:placed_by>ooti</groundspeak:placed_by>
<groundspeak:owner id="816431">ooti</groundspeak:owner>
<groundspeak:type>Traditional Cache</groundspeak:type>
<groundspeak:container>Small</groundspeak:container>
<groundspeak:difficulty>1.5</groundspeak:difficulty>
<groundspeak:terrain>2</groundspeak:terrain>
<groundspeak:country>Finland</groundspeak:country>
<groundspeak:state>
</groundspeak:state>
<groundspeak:short_description html="True">
</groundspeak:short_description>
<groundspeak:encoded_hints>
</groundspeak:encoded_hints>
<groundspeak:travelbugs />
</groundspeak:cache>
</wpt>
</gpx>
I want to get all the groundspeak:cache elements, but neither //groundspeak:cache nor //cache seems to return anything.
NSArray *caches = [self.xml nodesForXPath:@"//cache" error:&error];
Any clue?
Edit: Are there any Cocoa-based tools out there where I can load my XML and test different XPath expressions? I'm quite new to Objective-C and Cocoa, so it would be nice to check that it is really my XPath that is wrong.
The expression //cache selects descendant elements named cache that are in no namespace (the empty namespace).
Your groundspeak:cache element is in the namespace with URI http://www.groundspeak.com/cache/1/0.
So, if you can't declare a namespace-prefix binding (I think you can't with Cocoa...), you could use this XPath expression:
//*[namespace-uri()='http://www.groundspeak.com/cache/1/0' and
local-name()='cache']
If you don't want to be so strict about namespace...
//*[local-name()='cache']
But the latter is bad practice, because you could end up selecting the wrong nodes, and because a tool for working with XML should support namespaces.
As proof, this stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:copy-of select="//*[namespace-uri() =
'http://www.groundspeak.com/cache/1/0' and
local-name() = 'cache']"/>
</xsl:template>
</xsl:stylesheet>
Output:
<groundspeak:cache id="1521507" available="True" archived="False"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns="http://www.topografix.com/GPX/1/0"
xmlns:groundspeak="http://www.groundspeak.com/cache/1/0">
<groundspeak:name>Kadonneet ja karanneet</groundspeak:name>
<groundspeak:placed_by>ooti</groundspeak:placed_by>
<groundspeak:owner id="816431">ooti</groundspeak:owner>
<groundspeak:type>Traditional Cache</groundspeak:type>
<groundspeak:container>Small</groundspeak:container>
<groundspeak:difficulty>1.5</groundspeak:difficulty>
<groundspeak:terrain>2</groundspeak:terrain>
<groundspeak:country>Finland</groundspeak:country>
<groundspeak:state></groundspeak:state>
<groundspeak:short_description html="True"></groundspeak:short_description>
<groundspeak:encoded_hints></groundspeak:encoded_hints>
<groundspeak:travelbugs />
</groundspeak:cache>
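The difference between the unqualified name and the namespace-qualified one can also be checked quickly outside Cocoa. Here is a minimal sketch with Python's standard library, using a trimmed copy of the GPX fragment; the prefix gs is an arbitrary local alias chosen for the example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the GPX fragment from the question.
GPX = """<gpx xmlns="http://www.topografix.com/GPX/1/0" version="1.0">
  <wpt lat="62.244933" lon="25.74735">
    <name>GC22W1T</name>
    <groundspeak:cache id="1521507" xmlns:groundspeak="http://www.groundspeak.com/cache/1/0">
      <groundspeak:name>Kadonneet ja karanneet</groundspeak:name>
    </groundspeak:cache>
  </wpt>
</gpx>"""

NS = {"gs": "http://www.groundspeak.com/cache/1/0"}

root = ET.fromstring(GPX)
unqualified = root.findall(".//cache")  # looks for cache in no namespace
by_prefix = root.findall(".//gs:cache", NS)  # prefix bound to the namespace URI
by_uri = root.findall(".//{http://www.groundspeak.com/cache/1/0}cache")

print(len(unqualified), len(by_prefix), len(by_uri))
```

Only the two namespace-aware lookups find the element, which matches the explanation above: the match depends on the namespace URI, not on the prefix used in the source file.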
You need to add a new namespace attribute to the root node of your document, defining a prefix that you can use when querying the children:
NSXMLDocument *xmldoc = ...
NSXMLElement *namespace = [NSXMLElement namespaceWithName:@"mns" stringValue:@"http://mynamespaceurl.com/mynamespace"];
[xmldoc.rootElement addNamespace:namespace];
Then, when you query things later, you can use that prefix to refer to the namespace:
NSArray * caches = [xmldoc.rootElement nodesForXPath:@"//mns:caches" error:&error];
//groundspeak:cache should work. You might need a namespace-uri setting as well

Trying to parse a XML using Nokogiri with Ruby

I am new to programming so bear with me. I have an XML document that looks like this:
File name: PRIDE1542.xml
<ExperimentCollection version="2.1">
<Experiment>
<ExperimentAccession>1015</ExperimentAccession>
<Title>**Protein complexes in Saccharomyces cerevisiae (GPM06600002310)**</Title>
<ShortLabel>GPM06600002310</ShortLabel>
<Protocol>
<ProtocolName>**None**</ProtocolName>
</Protocol>
<mzData version="1.05" accessionNumber="1015">
<cvLookup cvLabel="RESID" fullName="RESID Database of Protein Modifications" version="0.0" address="http://www.ebi.ac.uk/RESID/" />
<cvLookup cvLabel="UNIMOD" fullName="UNIMOD Protein Modifications for Mass Spectrometry" version="0.0" address="http://www.unimod.org/" />
<description>
<admin>
<sampleName>**GPM06600002310**</sampleName>
<sampleDescription comment="Ho, Y., et al., Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature. 2002 Jan 10;415(6868):180-3.">
<cvParam cvLabel="NEWT" accession="4932" name="Saccharomyces cerevisiae (Baker's yeast)" value="Saccharomyces cerevisiae" />
</sampleDescription>
</admin>
</description>
<spectrumList count="0" />
</mzData>
</Experiment>
</ExperimentCollection>
I want to extract the text between the <Title>, <ProtocolName>, and <sampleName> tags and write it to a text file (I marked them with ** to make them easier to see). I have the following code so far (based on posts I saw on this site), but it does not seem to work:
>> require 'rubygems'
>> require 'nokogiri'
>> doc = Nokogiri::XML(File.open("PRIDE_Exp_Complete_Ac_10094.xml"))
>> @ExperimentCollection = doc.css("ExperimentCollection Title").map {|node| node.children.text }
Can someone help me?
Try to access them using xpath expressions. You can enter the path through the parse tree using slashes.
puts doc.xpath( "/ExperimentCollection/Experiment/Title" ).text
puts doc.xpath( "/ExperimentCollection/Experiment/Protocol/ProtocolName" ).text
puts doc.xpath( "/ExperimentCollection/Experiment/mzData/description/admin/sampleName" ).text
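To also cover the "put into a text file" part, here is a minimal sketch of the same three lookups in Python's standard library. The trimmed SAMPLE, the function names and the output file name pride_fields.txt are illustrative assumptions; paths are relative to the ExperimentCollection root.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Trimmed copy of the document in the question, without the ** markers.
SAMPLE = """<ExperimentCollection version="2.1">
  <Experiment>
    <Title>Protein complexes in Saccharomyces cerevisiae (GPM06600002310)</Title>
    <Protocol><ProtocolName>None</ProtocolName></Protocol>
    <mzData version="1.05" accessionNumber="1015">
      <description><admin><sampleName>GPM06600002310</sampleName></admin></description>
    </mzData>
  </Experiment>
</ExperimentCollection>"""

PATHS = [
    "Experiment/Title",
    "Experiment/Protocol/ProtocolName",
    "Experiment/mzData/description/admin/sampleName",
]

def extract_fields(xml_text):
    """Return the text of the first match for each path ('' when missing)."""
    root = ET.fromstring(xml_text)  # root is <ExperimentCollection>
    return [root.findtext(path, default="") for path in PATHS]

def write_fields(fields, path):
    """Write one field per line to a text file."""
    with open(path, "w", encoding="utf-8") as out:
        out.write("\n".join(fields) + "\n")

fields = extract_fields(SAMPLE)
out_path = os.path.join(tempfile.gettempdir(), "pride_fields.txt")
write_fields(fields, out_path)
print(fields)
```

Note that element names are case-sensitive: the sample uses sampleName, so a lookup for SampleName would return nothing.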

Resources