Find the maximum child count with XPath 1.0 - xpath

Can I find one XML node with the most children with XPath?
<xml>
<node id="1">
<child />
</node>
<node id="2">
<child /><child />
</node>
<node id="3">
<child /><child />
</node>
<node id="4">
<child /><child /><child />
</node>
<node id="5">
<child /><child /><child />
</node>
</xml>
I would like to select either node 4 or node 5 with a single, pure XPath 1.0 expression.

I know this is pretty old, but if it helps anyone out, I wanted to do this and I think this works, at least it does for me:
/xml/node[count(./child) > count(following-sibling::node/child) and count(./child) > count(preceding-sibling::node/child)]
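I'm not great with XPath, so I may be missing something: count(following-sibling::node/child) totals the children of all following siblings together (likewise for preceding-sibling), so this only matches a node that has more children than each of those groups combined, and it cannot match when the maximum is tied.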

I think that it is impossible because to count children you need function count() which has one parameter - node-set and returns count of elements in this set. So you have no option how to count more node-sets than one to get max value.
Note: I am talking about XPath 1.0

I also don't think this is possible (based on the fact that I haven't been able to do it :)). Of course, if you're allowed to change the xml (even just temporarily during this processing), you could update it to put the child count as an attribute on the node (or as the node value itself), after which it's easy:
/xml/node[not(../node/@childCount > ./@childCount)]
or
/xml/node[not(../node > .)]
But you probably already know that.
The other thing I thought might work was to do some clever maths along pigeon-hole principle lines, to take as inputs the total child count and the number of nodes, and produce a minimum child count that the max-node must have, and then doing
/xml/node[child[position()=formula_for_magic_number_goes_here]]
but I soon realised that I couldn't come up with a formula that would correctly deal with all cases. For example, if there were 10 nodes with child counts of 10, 9, 1, 1 (and the rest 1s too), no amount of manipulation of the numbers 27 and 10 is going to produce a cut-off point that includes 10 but excludes 9.
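For what it's worth, the "store the child count as an attribute" idea from this answer can be sketched in a host language. This is only an illustration in Python using the standard-library xml.etree.ElementTree, which supports a small subset of XPath, so the final comparison is done in Python rather than in an XPath predicate. The attribute name childCount comes from the answer; everything else is an assumption.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, with the closing tags written out.
XML = """<xml>
  <node id="1"><child/></node>
  <node id="2"><child/><child/></node>
  <node id="3"><child/><child/></node>
  <node id="4"><child/><child/><child/></node>
  <node id="5"><child/><child/><child/></node>
</xml>"""

root = ET.fromstring(XML)
nodes = root.findall("node")

# Step 1: store each node's child count as an attribute.
for node in nodes:
    node.set("childCount", str(len(node.findall("child"))))

# Step 2: keep the nodes whose count no other node exceeds,
# mirroring not(../node/@childCount > ./@childCount).
best = max(int(node.get("childCount")) for node in nodes)
max_ids = [node.get("id") for node in nodes if int(node.get("childCount")) == best]
print(max_ids)
```

Like the XPath on the annotated document, this keeps every node that ties for the maximum (here nodes 4 and 5).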

Related

XSLT Sort parent node based on specific attribute of a child

first post ever, have done lots of searching but cannot find an answer specific enough or more importantly, relevant enough. Note that I am a business analyst, not a developer, so I may be missing some understanding here.
We produce XML that we then process to produce a report. Where data can be represented by a table, the XML contains details for the table title (ELEMENT_HEADING), table header row (PROMPTS), then repeating nodes representing the rows (DATA) and columns (VALUES).
The problem I am facing is I need to sort the DATA node based on a text value of the node where the node has a specific attribute value.
In the sample XML provided below, I need to sort the DATA nodes based on the text value of the VALUE element with the attribute @pic='TRORGPCNT', in ascending order, i.e. the DATA node with a TRORGPCNT of 10 should appear before the DATA node with 90. Then, when the report is produced, the table rows are in ascending percentage order.
I hope I have explained myself clearly enough :)
Any tips on how I might accomplish this?
Sample XML:
<PROPOSAL_ELEMENT multi="Y" pec="TEACHRESP" elem_mandatory="N" elem_visible="Y">
<ELEMENT_HEADING pec="TEACHRESP">Teaching Responsibility</ELEMENT_HEADING>
<PROMPTS>
<PROMPT pic="TRORGUN" item_mandatory="Y" item_visible="Y">Faculty or School with teaching responsibility</PROMPT>
<PROMPT pic="TRORGPCNT" item_mandatory="Y" item_visible="Y">Teaching responsibility %</PROMPT>
</PROMPTS>
<DATA elem_mandatory="N" elem_visible="Y" delete_ind="N">
<VALUES>
<VALUE pic="TRORGUN" item_mandatory="Y" item_visible="Y" item_description="FACULTY OF NURSING AND HEALTH" display_in_summary_tab="Y" summary_order="">FACULTY OF NURSING AND HEALTH</VALUE>
<VALUE pic="TRORGPCNT" item_mandatory="Y" item_visible="Y" item_description="" display_in_summary_tab="Y" summary_order="">90</VALUE>
</VALUES>
</DATA>
<DATA elem_mandatory="N" elem_visible="Y" delete_ind="N">
<VALUES>
<VALUE pic="TRORGUN" item_mandatory="Y" item_visible="Y" item_description="FACULTY OF ARTS" display_in_summary_tab="Y" summary_order="">FACULTY OF ARTS</VALUE>
<VALUE pic="TRORGPCNT" item_mandatory="Y" item_visible="Y" item_description="" display_in_summary_tab="Y" summary_order="">10</VALUE>
</VALUES>
</DATA>
</PROPOSAL_ELEMENT>
Sorting in XSLT is accomplished using the xsl:sort instruction, which must appear as the first child of the for-each or apply-templates that selects the nodes you want to sort. If you're selecting the set of DATA element nodes then an appropriate sorting instruction would be
<xsl:sort select="VALUES/VALUE[@pic='TRORGPCNT']"
data-type="number" />
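To sanity-check the ordering that sort key should produce, here is a rough Python sketch (standard-library xml.etree.ElementTree, not XSLT) that sorts the DATA elements by the same path and compares the values as numbers. The sample is trimmed to the attributes that matter; the helper name percentage is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the sample XML (only the pic attributes kept).
XML = """<PROPOSAL_ELEMENT pec="TEACHRESP">
  <DATA><VALUES>
    <VALUE pic="TRORGUN">FACULTY OF NURSING AND HEALTH</VALUE>
    <VALUE pic="TRORGPCNT">90</VALUE>
  </VALUES></DATA>
  <DATA><VALUES>
    <VALUE pic="TRORGUN">FACULTY OF ARTS</VALUE>
    <VALUE pic="TRORGPCNT">10</VALUE>
  </VALUES></DATA>
</PROPOSAL_ELEMENT>"""

root = ET.fromstring(XML)

def percentage(data):
    # Same key as the xsl:sort select, converted to a number
    # (the role data-type="number" plays in XSLT).
    return float(data.findtext("VALUES/VALUE[@pic='TRORGPCNT']"))

rows = sorted(root.findall("DATA"), key=percentage)
ordered = [row.findtext("VALUES/VALUE[@pic='TRORGUN']") for row in rows]
print(ordered)
```

The numeric conversion matters: compared as text, "10" and "9" would sort in the wrong order.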

Looking for n-th instance of x node in root node

Suppose I have following xml
<root>
<x>
<y />
<z>
<y />
</z>
<n>
<m>
<y />*
</m>
</n>
</x>
<x>
<y />
<z>
<y />
</z>
<y />*
</x>
</root>
I would like to retrieve those y nodes which are followed with *
So it is always third node in x ancestor node
I tried something like:
//x//y[3]
However, it doesn't work; I guess it would work only if the y nodes were on the same level.
So I tried
(//x//y)[3] but it retrieves only one node (third one) in whole document
So I tried something like:
//x(//y)[3]
//x(//y[3])
//x//(y[3])
etc. but I get parse error
Is there any way to retrieve what I need using xpath?
Use:
//x/descendant::y[3]
This selects the third y descendant of each x in the document. It sometimes helps to write out an expanded expression to see what's really going on. In this case, the following:
//x//y[3]
is equivalent to:
/descendant-or-self::node()/child::x/descendant-or-self::node()/child::y[3]
Written this way it becomes obvious why it doesn't do what you wanted: it selects any y that is the third y child of its parent, where that parent is an x or a descendant of one, and there is no such y here. What you really wanted was the third y descendant of each x. Here it is fully expanded:
/descendant-or-self::node()/child::x/descendant::y[3]
The important lesson here is that it pays to know what the XPath abbreviated syntax is really doing. The spec is actually quite readable. I recommend taking a look.
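As a cross-check outside XPath, the "third y descendant of each x" reading can be sketched in Python with the standard-library xml.etree.ElementTree. This is only an illustration: it collects the y descendants of each x in document order and takes the third, which is what //x/descendant::y[3] is meant to select. The asterisks from the question are kept in the sample, where they end up as the tail text of the marked y elements.

```python
import xml.etree.ElementTree as ET

# Sample from the question; "*" marks the y elements that should be selected.
XML = """<root>
<x>
<y />
<z>
<y />
</z>
<n>
<m>
<y />*
</m>
</n>
</x>
<x>
<y />
<z>
<y />
</z>
<y />*
</x>
</root>"""

root = ET.fromstring(XML)

third_ys = []
for x in root.iter("x"):
    ys = x.findall(".//y")  # all y descendants of this x, in document order
    if len(ys) >= 3:
        third_ys.append(ys[2])  # position 3 in XPath is index 2 here

# The asterisk following each target y is stored as that element's tail text.
markers = [(y.tail or "").strip() for y in third_ys]
print(markers)
```

Both selected elements carry the asterisk marker, one per x.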
Update: both of the following expressions require XPath 2.0.
Using a parenthesized step:
/root//y/(ancestor::x//y)[3]
Using a for expression:
for $x in /root//x
return ($x//y)[3]

XPath with MSXML, "scope" of XPath-expressions

I have a problem understanding how XPath expressions behave in Microsoft XML Core Services 6.0 (MSXML).
I've broken the problem down to the simplest case, so let's take the following XML file:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<element name="E1A1">
<subEle value="1a"/>
<subEle value="1b"/>
<subEle value="1c"/>
</element>
<element name="E2A1">
<subEle value="2a"/>
<subEle value="2b"/>
<subEle value="3b"/>
</element>
<element name="E3A1">
<subEle value="3a"/>
<subEle value="3b"/>
<subEle value="3c"/>
</element>
</root>
I want to get the "value" attributes per "element". I will use pseudo-code to describe my problem and focus on the important things, so I will not show how I initialize the Msxml2.DOMDocument variable etc.
First, I get all "element" nodes that have a name attribute:
oNodeList = oDom.selectNodes("//element[@name]")
The result of the selectNodes statement is a node list whose items I access node by node in a for loop. In this loop, I execute another selectNodes statement that gives me (or so I thought) the "subEle"s for each "element":
for i from 1 to oNodeList.length
oNodeMain = oNodeList.nextNode()
oNodeResList = oNodeMain.selectNodes("//subEle")
msgInfo("n items", oNodeResList.length)
endFor
And here comes the problem: the selectNodes statement in the loop seems to have ALL "subEle"s in scope; the message box pops up three times, each time telling me the length of the node list is 9. I would have expected it to pop up three times, each time reporting a length of 3 (because every "element" has exactly 3 "subEle"s), since I'm calling selectNodes on "oNodeMain", which gets the next node in each iteration. Maybe I just need to modify the XPath expression in the loop and not use the "//", because it works then, but I have no idea why.
The program I use for this is Paradox 11, and I use MSXML via OLE.
Is this behaviour "normal"? Where is my misunderstanding? Any suggestions on how to achieve what I'm trying to do are welcome.
Don't use an absolute path starting with /, instead use a relative path i.e. oNodeMain.selectNodes("subEle") selects all subEle child elements of oNodeMain and oNodeMain.selectNodes(".//subEle") selects all descendant subEle elements of oNodeMain.
Your path starting with // searches from the root node (also called document node).
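The same scoping rule can be seen outside MSXML. Below is a small Python sketch with the standard-library xml.etree.ElementTree (an assumption for illustration; it is not MSXML and supports only a subset of XPath): a search from the top of the document finds all nine subEle elements, while a relative path evaluated on each element finds only its own three.

```python
import xml.etree.ElementTree as ET

# Sample from the question, compacted onto fewer lines.
XML = """<root>
  <element name="E1A1"><subEle value="1a"/><subEle value="1b"/><subEle value="1c"/></element>
  <element name="E2A1"><subEle value="2a"/><subEle value="2b"/><subEle value="3b"/></element>
  <element name="E3A1"><subEle value="3a"/><subEle value="3b"/><subEle value="3c"/></element>
</root>"""

root = ET.fromstring(XML)

# Searching from the top of the document reaches every subEle.
total = len(root.findall(".//subEle"))

# A relative path is evaluated against each element in turn,
# so only that element's own subEle children are returned.
per_element = {
    el.get("name"): [s.get("value") for s in el.findall("subEle")]
    for el in root.findall(".//element[@name]")
}

print(total)
for name, values in per_element.items():
    print(name, len(values), values)
```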

Wrapping an XML element with its ancestor nodes/tags

I can't navigate the XML doc programmatically and I need an one-line XPath solution for reasons I describe at the end.
I am working with an XML schema that looks something like the one below. (This is something I have to use as-is.)
<Root>
<!-- Child 1 -->
<Child>
<Name>Joe</Name>
<Age>12</Age>
</Child>
<!-- Child 2 -->
<Child>
<Name>Mike</Name>
<Age>25</Age>
</Child>
<!-- Child 3 -->
<Child>
<Name>Jane</Name>
<Age>20</Age>
</Child>
</Root>
Assuming I'm already at the "Joe" node (i.e. the Name element inside Child 1), I need to define an XPath query that will "wrap" that node as follows:
<Root>
<!-- Child 1 -->
<Child>
<Name>Joe</Name>
<Age>12</Age>
</Child>
</Root>
I've tried various combinations of ancestor, string-join, concat, etc., but can't seem to find the solution that "wraps" the element correctly. (The way I was using ancestor was returning all Child nodes, for example, which is not what I need.)
Some other considerations:
The solution has to be a one-line XPath query, if that's possible (for reasons given below).
It has to be generic enough to work for any Child element (i.e., it can't assume that I'm always at the first or second or third child, for example).
From the example above, you can see that I don't actually need the actual Root node per-se, just its tag (i.e. I don't want all Child nodes under it). However, I do need the actual Child node (so that I get the Name and Age).
NOTE: For what it's worth, I can't actually navigate the XML programmatically. I am using a library (whose code I cannot change) in which I have to define everything in terms of one-line XPath queries within a configuration file. It will essentially navigate through all of the Name elements, so my solution has to work from that point.
XPath is a query language.
This, among other things means that the evaluation of an XPath expression never modifies the XML document.
So, the answer is: Modifying an XML document or creating a new document cannot be done using only XPath.
Such transformations are very easy and natural to specify with XSLT.

Select only the first matching node in XPath

I have the following XML:
<parent>
<pet>
<data>
<birthday/>
</data>
</pet>
<pet>
<data>
<birthday/>
</data>
</pet>
</parent>
And now I want to select the first birthday element via parent//birthday[1], but this returns both birthday elements because both of them are the first child of their parents.
How can I select only the first birthday element of the entire document, no matter where it is located? I've tried parent//birthday[position()=1], but that doesn't work either.
You mean (note the parentheses!)
(/parent/pet/data/birthday)[1]
or, a shorter, but less specific variation:
(/*/*/*/birthday)[1]
(//birthday)[1]
or, more semantic, the "birthday of the first pet":
/parent/pet[1]/data/birthday
or, if not all pets have birthday entries, the "birthday of the first pet for which a birthday is set":
/parent/pet[data/birthday][1]/data/birthday
If you work from a context node, you can abbreviate the expression by making it relative to that context node.
Explanation:
/parent/pet/data/birthday[1] selects all <birthday> nodes that are the first in their respective parents (the <data> nodes), throughout the document
(/parent/pet/data/birthday)[1] selects all <birthday> nodes, and of those (that's what the parentheses do, they create an intermediary node-set), it takes the first one
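The difference between the two forms can be tried out in Python with the standard-library xml.etree.ElementTree. This is a sketch under the assumption that its limited XPath subset is enough here: it understands the per-parent position predicate, but not the parenthesized form, so "first in the whole document" is expressed with find(), which returns the first match in document order.

```python
import xml.etree.ElementTree as ET

# Sample from the question, compacted.
XML = """<parent>
  <pet><data><birthday/></data></pet>
  <pet><data><birthday/></data></pet>
</parent>"""

root = ET.fromstring(XML)

# Position predicate applied per parent: every birthday that is the
# first birthday child of its own <data> element.
per_parent = root.findall(".//birthday[1]")

# First match in document order, the effect of (//birthday)[1].
first_overall = root.find(".//birthday")

print(len(per_parent))
print(first_overall is per_parent[0])
```

The predicate form returns both birthday elements, while find() returns only the one inside the first pet.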
FYI: you can visualize the results of the various XPath queries with the (free) XPathVisualizer tool. Works on Windows only.
Ok, I admit this is horrendous and there must be a better way, but it appears to work.
/*/*[descendant::birthday and not(preceding-sibling::*[descendant::birthday])]
I look for all elements at the second level in the tree that have a descendant element called birthday that do not have a preceding sibling element that has a birthday element as a descendant.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:variable name="birthdays" select="//birthday"/>
<xsl:value-of select="$birthdays[1]"/>
</xsl:template>
</xsl:stylesheet>
try
//birthday[position()=1]
// finds nodes no matter where they are in the hierarchy
you could also do
pet[position()=1]/data/birthday
