XSLT Sort parent node based on specific attribute of a child - sorting

First post ever. I have done a lot of searching but cannot find an answer that is specific enough or, more importantly, relevant enough. Note that I am a business analyst, not a developer, so I may be missing some understanding here.
We produce XML that we then process to produce a report. Where data can be represented by a table, the XML contains details for the table title (ELEMENT_HEADING), table header row (PROMPTS), then repeating nodes representing the rows (DATA) and columns (VALUES).
The problem I am facing is I need to sort the DATA node based on a text value of the node where the node has a specific attribute value.
In the sample XML provided below, I need to sort the DATA nodes by the text of the VALUE element whose attribute is @pic='TRORGPCNT', in ascending order, i.e. the DATA node with a TRORGPCNT of 10 should appear before the DATA node with 90. Then, when the report is produced, the table rows are in ascending percentage order.
I hope I have explained myself clearly enough :)
Any tips on how I might accomplish this?
Sample XML:
<PROPOSAL_ELEMENT multi="Y" pec="TEACHRESP" elem_mandatory="N" elem_visible="Y">
<ELEMENT_HEADING pec="TEACHRESP">Teaching Responsibility</ELEMENT_HEADING>
<PROMPTS>
<PROMPT pic="TRORGUN" item_mandatory="Y" item_visible="Y">Faculty or School with teaching responsibility</PROMPT>
<PROMPT pic="TRORGPCNT" item_mandatory="Y" item_visible="Y">Teaching responsibility %</PROMPT>
</PROMPTS>
<DATA elem_mandatory="N" elem_visible="Y" delete_ind="N">
<VALUES>
<VALUE pic="TRORGUN" item_mandatory="Y" item_visible="Y" item_description="FACULTY OF NURSING AND HEALTH" display_in_summary_tab="Y" summary_order="">FACULTY OF NURSING AND HEALTH</VALUE>
<VALUE pic="TRORGPCNT" item_mandatory="Y" item_visible="Y" item_description="" display_in_summary_tab="Y" summary_order="">90</VALUE>
</VALUES>
</DATA>
<DATA elem_mandatory="N" elem_visible="Y" delete_ind="N">
<VALUES>
<VALUE pic="TRORGUN" item_mandatory="Y" item_visible="Y" item_description="FACULTY OF ARTS" display_in_summary_tab="Y" summary_order="">FACULTY OF ARTS</VALUE>
<VALUE pic="TRORGPCNT" item_mandatory="Y" item_visible="Y" item_description="" display_in_summary_tab="Y" summary_order="">10</VALUE>
</VALUES>
</DATA>
</PROPOSAL_ELEMENT>

Sorting in XSLT is accomplished using the xsl:sort instruction, which appears as a child of the xsl:for-each or xsl:apply-templates that selects the nodes you want to sort (inside xsl:for-each it must come before any other content). If you're selecting the set of DATA element nodes, then an appropriate sorting instruction would be
<xsl:sort select="VALUES/VALUE[@pic='TRORGPCNT']"
data-type="number" />
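For context, here is a minimal sketch of a complete XSLT 1.0 stylesheet built around that instruction. It assumes the DATA elements are direct children of PROPOSAL_ELEMENT and that every other child (ELEMENT_HEADING, PROMPTS) comes before them, as in the sample; the identity template and the overall layout are illustrative choices, not something taken from the original report stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <!-- Drop whitespace-only text nodes so the reordered output indents cleanly. -->
  <xsl:strip-space elements="*"/>

  <!-- Identity template: by default, copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override for the table element (assumption: DATA rows are its direct children). -->
  <xsl:template match="PROPOSAL_ELEMENT">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Heading and prompts keep their original order. -->
      <xsl:apply-templates select="node()[not(self::DATA)]"/>
      <!-- DATA rows are emitted sorted numerically by the TRORGPCNT value. -->
      <xsl:apply-templates select="DATA">
        <xsl:sort select="VALUES/VALUE[@pic='TRORGPCNT']"
                  data-type="number" order="ascending"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The data-type="number" attribute matters here: without it the values are compared as text, so a value such as 100 would sort before 90.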

Related

Is the use of logical comparisons quicker than axes in XPath?

I have an XML document that will balloon in size as time goes on, and I would like to ensure that my XPath choice for an XSL select will be as efficient as possible.
The document contains the following types of elements:
<simple_instance>
<name>Class0</name>
<type>Business_Capability</type>
<own_slot_value>
<slot_reference>contained_business_capabilities</slot_reference>
<value value_type="simple_instance">Class1</value>
<value value_type="simple_instance">Class3</value>
<value value_type="simple_instance">Class4</value>
<value value_type="simple_instance">Class5</value>
</own_slot_value>
<own_slot_value>
<slot_reference>business_capability_level</slot_reference>
<value value_type="string">1</value>
</own_slot_value>
<own_slot_value>
<slot_reference>name</slot_reference>
<value value_type="string">Planning</value>
</own_slot_value>
</simple_instance>
Which of these two selectors (which find elements like the one above) will be more efficient in the long run?
/node()/simple_instance[type='Business_Capability']/own_slot_value/slot_reference[text()='business_capability_level']/following-sibling::value[text()='1']
or
/node()/simple_instance[type='Business_Capability' and (own_slot_value/slot_reference='business_capability_level') and (own_slot_value/value='1')]
My guess is that, if the implementation of XML short-circuits the and, the latter will be quicker.
Note: I'm using Protege's XML/XSL capabilities.
The two XPath expressions have different results, so asking which is faster seems irrelevant (the first selects a value element, the second a simple_instance element).
In addition, XPath is a specification not an implementation. Implementations differ widely in their strategies for evaluating complex paths. An answer that is true for one implementation may well not be true for another. Measure it and see (and tell us the answer).

Xpath - How to select subnode where sibling-node contains certain text

I want to use XPath to select the sub tree containing the <name>-tag with "ABC" and not the other one from the following xml. Is this possible? And as a minor question, which keywords would I use to find something like that over Google (e.g. for selecting the sub tree by an attribute I would have the terminology for)?
<root>
<operation>
<name>ABC</name>
<description>Description 1</description>
</operation>
<operation>
<name>DEF</name>
<description>Description 2</description>
</operation>
</root>
Use:
/*/operation[name='ABC']
For your second question: I strongly recommend not to rely on online sources (there are some that aren't so good) but to read a good book on XPath.
See some resources listed here:
https://stackoverflow.com/questions/339930/any-good-xslt-tutorial-book-blog-site-online/341589#341589
For your first question, I think a more accurate way to do it would be: //operation[./name[text()='ABC']]. And according to this, we can also make it: //operation[./name[text()[.='ABC']]]
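To see the first expression in use, here is a small XSLT 1.0 sketch that copies only the matching sub tree to the output. The stylesheet itself is an illustrative assumption; the only part taken from the answers is the XPath in the select attribute.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- Selects each operation element that has a name child whose string value is 'ABC'. -->
    <xsl:copy-of select="/*/operation[name='ABC']"/>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the sample document, this should output the first operation element, including its name and description children.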

Wrapping an XML element with its ancestor nodes/tags

I can't navigate the XML doc programmatically and I need a one-line XPath solution for reasons I describe at the end.
I am working with an XML schema that looks something like the one below. (This is something I have to use as-is.)
<Root>
<!-- Child 1 -->
<Child>
<Name>Joe</Name>
<Age>12</Age>
</Child>
<!-- Child 2 -->
<Child>
<Name>Mike</Name>
<Age>25</Age>
</Child>
<!-- Child 3 -->
<Child>
<Name>Jane</Name>
<Age>20</Age>
</Child>
</Root>
Assuming I'm already at the "Joe" node (i.e. the Name element inside Child 1), I need to define an XPath query that will "wrap" that node as follows:
<Root>
<!-- Child 1 -->
<Child>
<Name>Joe</Name>
<Age>12</Age>
</Child>
</Root>
I've tried various combinations of ancestor, string-join, concat, etc., but can't seem to find the solution that "wraps" the element correctly. (The way I was using ancestor was returning all Child nodes, for example, which is not what I need.)
Some other considerations:
The solution has to be a one-line XPath query, if that's possible (for reasons given below).
It has to be generic enough to work for any Child element (i.e., it can't assume that I'm always at the first or second or third child, for example).
From the example above, you can see that I don't actually need the actual Root node per-se, just its tag (i.e. I don't want all Child nodes under it). However, I do need the actual Child node (so that I get the Name and Age).
NOTE: For what it's worth, I can't actually navigate the XML programmatically. I am using a library (whose code I cannot change) in which I have to define everything in terms of one-line XPath queries within a configuration file. It will essentially navigate through all of the Name elements, so my solution has to work from that point.
XPath is a query language.
This, among other things means that the evaluation of an XPath expression never modifies the XML document.
So, the answer is: Modifying an XML document or creating a new document cannot be done using only XPath.
Such transformations are very easy and natural to specify with XSLT.
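As a rough illustration of the XSLT route, the sketch below builds a new Root element containing only the Child whose Name matches a given value. The parameter name childName and its default of 'Joe' are hypothetical and exist only for this example; it does not address the one-line XPath constraint described in the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Hypothetical parameter: the Name to look for; can be overridden by the caller. -->
  <xsl:param name="childName" select="'Joe'"/>

  <xsl:template match="/Root">
    <!-- Create a fresh Root wrapper rather than copying the original with all its children. -->
    <Root>
      <!-- Deep-copy only the Child elements whose Name equals the parameter. -->
      <xsl:copy-of select="Child[Name = $childName]"/>
    </Root>
  </xsl:template>
</xsl:stylesheet>
```

If several Child elements share the same Name, all of them are copied, so a more specific predicate would be needed to single one out.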

if I select nodes from an XDocument is the order always preserved?

Let's say I have nodes like so:
<Params>
<Param val="C" />
<Param val="D" />
<Param val="A" />
<Param val="B" />
</Params>
If I select the Descendants of Params is the order always preserved? I want C to always be first when I iterate through the ordered list that I'll be dropping these into. Or do I need to come up with a different solution for ordering nodes? I'd like to stay away from numbers (order="1", 2 etc.) so any suggestions would be great.
The documentation for the Descendants method says:
Returns a collection of the descendant elements for this document or element, in document order.
So the answer is yes, they will be returned in the same order they appear in the original XML.

Find the maximum child count with XPath 1.0

Can I find one XML node with the most children with XPath?
<xml>
<node id="1">
<child />
</node>
<node id="2">
<child /><child />
</node>
<node id="3">
<child /><child />
</node>
<node id="4">
<child /><child /><child />
</node>
<node id="5">
<child /><child /><child />
</node>
</xml>
I would like to select either node 4 or node 5 with a single, pure XPath 1.0 expression.
I know this is pretty old, but if it helps anyone out, I wanted to do this and I think this works, at least it does for me:
/xml/node[count(./child) > count(following-sibling::node/child) and count(./child) > count(preceding-sibling::node/child)]
I'm not great with Xpath so maybe I'm missing something.
I think that it is impossible because to count children you need function count() which has one parameter - node-set and returns count of elements in this set. So you have no option how to count more node-sets than one to get max value.
Note: I am talking about XPath 1.0
I also don't think this is possible (based on the fact that I haven't been able to do it :)). Of course, if you're allowed to change the xml (even just temporarily during this processing), you could update it to put the child count as an attribute on the node (or as the node value itself), after which it's easy:
/xml/node[not(../node/@childCount > ./@childCount)]
or
/xml/node[not(../node > .)]
But you probably already know that.
The other thing I thought might work was to do some clever maths along pigeon-hole principle lines, to take as inputs the total child count and the number of nodes, and produce a minimum child count that the max-node must have, and then doing
/xml/node[child[position()=formula_for_magic_number_goes_here]]
but I soon realised that I couldn't come up with a formula that would correctly deal with all cases. For example, if there were 10 nodes with child counts of 10, 9, 1, 1 (and the rest 1s too), no amount of manipulation of the numbers 27 and 10 is going to produce a cut-off point that includes 10 but excludes 9.
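If the expression does not have to be pure XPath and an XSLT 1.0 processor is available, one possible workaround is to sort the nodes by child count and keep the first. This is only a sketch of that alternative, not an answer to the pure XPath 1.0 question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/xml">
    <xsl:for-each select="node">
      <!-- Order the node elements by their number of child elements, largest first. -->
      <xsl:sort select="count(child)" data-type="number" order="descending"/>
      <!-- position() refers to the sorted order, so 1 is a node with the maximum count. -->
      <xsl:if test="position() = 1">
        <xsl:copy-of select="."/>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Because xsl:sort is stable, nodes with equal counts stay in document order, so for the sample this should pick node 4 rather than node 5.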
