I'm working with some XML that keeps all its data in attributes, and I need to extract that data for processing. We have something like:
<data>
<level1>
<level2 att1="1">
<level3>
<level4 att2="a" att3="b" att4="c" />
</level3>
<level3>
<level4 att2="1" att3="2" att4="3" />
</level3>
</level2>
<level2 att1="2">
I want all attributes from the level4 entries and their values, but ONLY from level2 entries where att1 = 1. I'm new to XPath, so I haven't worked it out yet. So far, the best I've come up with is:
/data/level1/level2[@att1='1']/level3/level4[@*]
but that is returning empty data. Any help would be appreciated.
To get all the desired attribute values, you only need a small change to your XPath expression: remove the last predicate and select the attributes directly. The predicate [@*] only filters for level4 elements that have at least one attribute, so the expression still returns elements rather than attributes.
/data/level1/level2[@att1='1']/level3/level4/@*
Iterating over the resulting node-set gives you all the desired values and, if wanted, the attribute names.
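As a rough illustration of iterating over the result, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. ElementTree supports only a subset of XPath and cannot select attributes with /@*, so the sketch selects the level4 elements and reads their attributes instead. The content of the second level2 element is an assumption, since the sample in the question is cut off there.

```python
import xml.etree.ElementTree as ET

# Sample from the question; the content of the second <level2> is
# hypothetical, because the original sample is truncated at that point.
XML = """<data>
  <level1>
    <level2 att1="1">
      <level3><level4 att2="a" att3="b" att4="c" /></level3>
      <level3><level4 att2="1" att3="2" att4="3" /></level3>
    </level2>
    <level2 att1="2">
      <level3><level4 att2="x" att3="y" att4="z" /></level3>
    </level2>
  </level1>
</data>"""

root = ET.fromstring(XML)  # root is the <data> element

# Select the level4 elements under level2[@att1='1'], then collect
# every (attribute name, attribute value) pair from each of them.
pairs = []
for level4 in root.findall("./level1/level2[@att1='1']/level3/level4"):
    pairs.extend(level4.attrib.items())

for name, value in pairs:
    print(name, value)
```

With a full XPath 1.0 engine, the expression ending in /@* could be evaluated directly and would return the attribute nodes themselves.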
I am trying to validate the following XML using a Schematron rule.
XML:
<?xml version="1.0" encoding="utf-8"?>
<Biotic><Maul><Number>1</Number>
<Record><Code IDREF="a1"/>
<Detail><ItemID>1</ItemID></Detail>
<Detail><ItemID>3</ItemID></Detail>
</Record>
<Record><Code IDREF="b1"/>
<Detail><ItemID>3</ItemID></Detail>
<Detail><ItemID>4</ItemID></Detail>
</Record>
<Record><Code IDREF="b1"/>
<Detail><ItemID>4</ItemID></Detail>
<Detail><ItemID>6</ItemID></Detail>
</Record>
<Record><Code IDREF="c1"/>
<Detail><ItemID>5</ItemID></Detail>
<Detail><ItemID>5</ItemID></Detail>
</Record>
</Maul></Biotic>
The check is: "ItemID should be unique for the given Code within the given Maul."
So, per the requirement, the Records with Code b1 are not valid because ItemID 4 exists in both records.
Similarly, Record c1 is not valid because it has two Detail nodes with ItemID 5.
Record a1 is valid: ItemID 3 also exists in the next record, but the code is different.
Schematron rule I tried:
<?xml version="1.0" encoding="utf-8" ?><schema xmlns="http://purl.oclc.org/dsdl/schematron" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<title>Schematron validation rule</title>
<pattern id="P1">
<rule context="Maul/Record" id="R1">
<let name="a" value="//Detail/[./ItemID, ../Code/@IDREF]"/>
<let name="b" value="current()/Detail/[./ItemID, ../Code/@IDREF]"/>
<assert test="count($a[. = $b]) = count($b)">
ItemID should be unique for the given Code within the given Maul.
</assert>
</rule>
</pattern>
</schema>
The two let values seem problematic. They will each return a Detail element (and all of its content, including attributes, child elements, and text nodes). I'm not sure what the code inside the predicates [./ItemID, ../Code/@IDREF] is going to do, but I think it will return all Detail elements that have either a child ItemID element or a sibling Code element with an @IDREF attribute, regardless of what the values of ItemID or @IDREF are.
I think I would change the rule/@context to ItemID, so the assert would fail once for each ItemID that violates the constraint.
Here are a rule and assert that work correctly:
<?xml version="1.0" encoding="utf-8" ?><schema xmlns="http://purl.oclc.org/dsdl/schematron" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<title>Schematron validation rule</title>
<pattern id="P1">
<rule context="Maul/Record/Detail/ItemID" id="R1">
<assert test="count(ancestor::Maul/Record[Code/@IDREF = current()/ancestor::Record/Code/@IDREF]/Detail/ItemID[. = current()]) = 1">
ItemID should be unique for the given Code within the given Maul.
</assert>
</rule>
</pattern>
</schema>
The assert test finds, within the ancestor Maul, any Record that has a Code/@IDREF that equals the Code/@IDREF of the Record that the current ItemID is in. At minimum, it will find one Record (the one that the current ItemID is in). Then it looks for any Detail/ItemID within those Records that is equal to the current ItemID. It will find at least one (the current ItemID). The count function counts how many ItemIDs are found. If more than one is found, the assert fails.
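To cross-check which (Code, ItemID) combinations the assert should flag, the same counting logic can be sketched outside Schematron. This is only an illustrative Python sketch using the standard library, not a replacement for the rule; the variable names are made up for the example. Unlike the assert, which fails once per offending ItemID element, it reports each duplicated pair once.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Sample document from the question (XML declaration omitted).
XML = """<Biotic><Maul><Number>1</Number>
<Record><Code IDREF="a1"/>
<Detail><ItemID>1</ItemID></Detail>
<Detail><ItemID>3</ItemID></Detail>
</Record>
<Record><Code IDREF="b1"/>
<Detail><ItemID>3</ItemID></Detail>
<Detail><ItemID>4</ItemID></Detail>
</Record>
<Record><Code IDREF="b1"/>
<Detail><ItemID>4</ItemID></Detail>
<Detail><ItemID>6</ItemID></Detail>
</Record>
<Record><Code IDREF="c1"/>
<Detail><ItemID>5</ItemID></Detail>
<Detail><ItemID>5</ItemID></Detail>
</Record>
</Maul></Biotic>"""

root = ET.fromstring(XML)  # root is the <Biotic> element

violations = []
for maul in root.iter("Maul"):
    # Count how often each (Code IDREF, ItemID) pair occurs within this Maul.
    counts = Counter()
    for record in maul.findall("Record"):
        code = record.find("Code").get("IDREF")
        for item in record.findall("Detail/ItemID"):
            counts[(code, item.text)] += 1
    # A pair that occurs more than once violates the uniqueness constraint.
    violations.extend(sorted(pair for pair, n in counts.items() if n > 1))

print(violations)  # [('b1', '4'), ('c1', '5')]
```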
Thanks for the reference to https://www.liquid-technologies.com/online-schematron-validator! I wasn't aware of that tool.
I have this data, and I'm looking for the lowest bid.
<root>
<current_bid>$1.00</current_bid>
<current_bid>$2.00</current_bid>
<current_bid>$3.00</current_bid>
<current_bid>$4.00</current_bid>
<current_bid>$5.00</current_bid>
</root>
This is my XPath 1.0 attempt:
//current_bid[not(translate (., '$,.','') > translate(//current_bid, '$,.',''))]
And it works fine (returns only the $1.00 bid) with the data above, but if I change the order of the data to, say, this:
<root>
<current_bid>$5.00</current_bid>
<current_bid>$1.00</current_bid>
<current_bid>$2.00</current_bid>
<current_bid>$3.00</current_bid>
<current_bid>$4.00</current_bid>
</root>
Then it gives a wrong output (returns all values).
Shouldn't the order be irrelevant when I use //current_bid, since it queries the whole document?
Also: how would I go about getting the second lowest bid?
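XPath 1.0 processes nodes in document order, so there's no way to sort them with pure XPath. It can be done with XSL processing.
This approach works only if the minimum is in the first position, because translate(//current_bid, ...) converts the node-set to the string value of its first node in document order, so every bid is compared against the first bid only.
XPath: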
'//current_bid[(position()<=last()) and not(translate (., "$,.","") > translate(//current_bid, "$,.",""))]'
Sample:
<root>
<current_bid>$1.00</current_bid>
<current_bid>$5.00</current_bid>
<current_bid>$2.00</current_bid>
<current_bid>$4.00</current_bid>
<current_bid>$3.00</current_bid>
</root>
Testing on command line with xmllint
xmllint --xpath '//current_bid[(position()<=last()) and not(translate (., "$,.","") > translate(//current_bid, "$,.",""))]' test.xml ; echo
Result:
<current_bid>$1.00</current_bid>
If the number of nodes is known in advance, it could perhaps be done with nested conditions, but that would give a very complex XPath expression.
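If the sorting can be done in the host language instead of in XPath, both the lowest and the second lowest bid become straightforward. The following is a minimal Python sketch using only the standard library, offered as one possible workaround rather than an XPath solution; the helper name bid_value is made up for the example.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Reordered sample from the question, with the minimum not in first position.
XML = """<root>
<current_bid>$5.00</current_bid>
<current_bid>$1.00</current_bid>
<current_bid>$2.00</current_bid>
<current_bid>$3.00</current_bid>
<current_bid>$4.00</current_bid>
</root>"""

def bid_value(text):
    """Turn a string such as '$1,234.50' into a Decimal for comparison."""
    return Decimal(text.replace("$", "").replace(",", ""))

root = ET.fromstring(XML)

# Sort the elements by numeric value; document order no longer matters.
bids = sorted(root.iter("current_bid"), key=lambda e: bid_value(e.text))

lowest = bids[0].text
second_lowest = bids[1].text
print(lowest, second_lowest)  # $1.00 $2.00
```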
I would like to know if the following XPath expression can be simplified:
//map[requester/@type='2' and requester/code]
Some test data:
<root>
<map>
<requester type="2">
<code>a</code>
<code>b</code>
</requester>
</map>
...
</root>
My objective is to get only map elements which have at least one requester with type attribute and value '2' and also have at least one code element.
For your use case, this is probably as simple as it can be. However, it doesn't match what you describe.
Here you are selecting map elements where
(1) there is a requester element with a type attribute equal to 2, and
(2) there is a requester element with a code element.
The requester elements in (1) and (2) are not necessarily the same.
For example, the map element in the following is selected:
<root>
<map>
<requester type="2"/>
<requester>
<code>a</code>
</requester>
</map>
</root>
If you want the elements in (1) and (2) to be the same, you should use the following (simplified slightly at the suggestion of kjhughes):
//map[requester[@type='2']/code]
Here we select all map elements which have a requester element which in turn has an attribute type with a value of 2 and a code element.
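The difference between the two expressions can be checked with a small Python sketch. The standard library's ElementTree does not support nested predicates, so the two conditions are spelled out as Python functions here; the function names loose_match and strict_match are made up for the example and only mimic the two XPath expressions.

```python
import xml.etree.ElementTree as ET

# Counterexample from the answer: type="2" and <code> sit on
# two different requester elements.
SPLIT = """<root><map>
<requester type="2"/>
<requester><code>a</code></requester>
</map></root>"""

# Test data from the question: one requester has both.
SAME = """<root><map>
<requester type="2"><code>a</code><code>b</code></requester>
</map></root>"""

def loose_match(root):
    """Mimics //map[requester/@type='2' and requester/code]."""
    return [m for m in root.iter("map")
            if m.find("requester[@type='2']") is not None
            and m.find("requester/code") is not None]

def strict_match(root):
    """Mimics //map[requester[@type='2']/code]."""
    return [m for m in root.iter("map")
            if any(r.find("code") is not None
                   for r in m.findall("requester[@type='2']"))]

split_root = ET.fromstring(SPLIT)
same_root = ET.fromstring(SAME)

print(len(loose_match(split_root)), len(strict_match(split_root)))  # 1 0
print(len(loose_match(same_root)), len(strict_match(same_root)))    # 1 1
```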
I have the following xml structure:
<?xml version="1.0" encoding="UTF-8"?>
<sql>
<Assoc name="sql">
<RecArray name="contents">
<Record name="contents">
<String name="PackType"><value actual="P" /></String>
<String name="SerialNumber"><value actual="0002" /></String>
<String name="VersionNumber"><value actual="02" /></String>
</Record>
</RecArray>
</Assoc>
</sql>
How can I get the value of each of the String nodes? For example, I need to know the value inside the "SerialNumber" node.
Regards,
If you want to get all <value> elements inside each <String> element, you can try this XPath query:
/sql/Assoc/RecArray/Record/String/value
A precise path will perform better. If you're looking for a simpler query, this will also work:
//String/value
Or, if by "values of each of the String nodes" you mean the value of the actual attribute, you can do it this way:
/sql/Assoc/RecArray/Record/String/value/@actual
Finally, if none of the above meets your requirement, please update the question and provide the expected output for the sample XML posted.
I figured it out.
Since there are multiple String elements (that was clear in the question), I should use the following:
/sql/Assoc/RecArray/Record/String[2]/value/@actual
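For comparison, here is a minimal Python sketch of the same lookup with the standard library's ElementTree. It shows the positional form used above and, as an alternative that is not from the original answers, a lookup by the name attribute, which does not depend on the order of the String elements. ElementTree cannot select attributes in the path itself, so the attribute is read with get().

```python
import xml.etree.ElementTree as ET

# Sample from the question (XML declaration omitted).
XML = """<sql>
<Assoc name="sql">
<RecArray name="contents">
<Record name="contents">
<String name="PackType"><value actual="P" /></String>
<String name="SerialNumber"><value actual="0002" /></String>
<String name="VersionNumber"><value actual="02" /></String>
</Record>
</RecArray>
</Assoc>
</sql>"""

root = ET.fromstring(XML)  # root is the <sql> element

# Positional lookup, as in the self-answer: the second String element.
by_position = root.find("Assoc/RecArray/Record/String[2]/value").get("actual")

# Alternative lookup by the name attribute (an assumption about what is
# wanted, but robust if the elements are reordered).
by_name = root.find(
    "Assoc/RecArray/Record/String[@name='SerialNumber']/value"
).get("actual")

print(by_position, by_name)  # 0002 0002
```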
I have XML like this:
<root xmlns:ns1="http://foo">
<ns1:child1>Text</ns1:child1>
<ns1:child2>Number</ns1:child2>
</root>
Now I get this from different people; for example, person 2 sends me another message with the same structure, like
<root xmlns:anotherNs="http://foo">
<anotherNs:child1>Another Text</anotherNs:child1>
<anotherNs:child2>Another Number</anotherNs:child2>
</root>
So the only difference is the namespace prefix. How can I select the content of child2 for both XML documents with one XPath expression?
Something like "/root/child2" or "//child2" did not work.
Use the local-name() function like so:
//*[local-name()='child2']
You can bind any prefix you like (say banana) to the namespace "http://foo", and the expression /root/banana:child2 will find the child2 element, regardless of what namespace prefix has been used in the source document. Only the namespace URI has to match.
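The prefix-binding approach can be tried with a short Python sketch using the standard library's ElementTree, where the binding is passed as a dictionary. The prefix banana is taken from the answer above; the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

DOC1 = """<root xmlns:ns1="http://foo">
<ns1:child1>Text</ns1:child1>
<ns1:child2>Number</ns1:child2>
</root>"""

DOC2 = """<root xmlns:anotherNs="http://foo">
<anotherNs:child1>Another Text</anotherNs:child1>
<anotherNs:child2>Another Number</anotherNs:child2>
</root>"""

# Bind our own prefix to the namespace URI. The prefixes used inside
# the documents are irrelevant; only the URI has to match.
NS = {"banana": "http://foo"}

results = []
for doc in (DOC1, DOC2):
    root = ET.fromstring(doc)  # root is the un-namespaced <root> element
    results.append(root.find("banana:child2", NS).text)

print(results)  # ['Number', 'Another Number']
```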