XPath count function clarification

I would like to know why the following XPath expression gives a count of 2 instead of 3. Thanks for your help.
XPath:
<xsl:value-of select="count(//x[1]/y[1])"/>
XML:
<?xml version="1.0"?>
<test>
<x a="1">
<x a="2">
<x>
<y>y31</y>
<y>y32</y>
</x>
</x>
</x>
<x a="1">
<x a="2">
<y>y21</y>
<y>y22</y>
</x>
</x>
<x a="1">
<y>y11</y>
<y>y12</y>
</x>
<x>
<y>y03</y>
<y>y04</y>
</x>
</test>
count(//x[1]/y[1]) is selecting the following 2 elements:
1) <x>
<y>y31</y>
2) <x a="2">
<y>y21</y>
and it is not selecting either of the following elements at the same level, which would bring the count to 3. I would like to understand why.
<x a="1">
<y>y11</y>
or
<x>
<y>y03</y>
thanks,
Mathew

//x[1]/y[1] selects every y element that is the first y child of an x element that is itself the first x child of its parent. The position in [1] is counted separately for each parent, not across the whole document.
In the 1st child of test
<x a="1">
<x a="2">
<x> <!-- x is the first x child of its parent -->
<y>y31</y> <!-- y is the first y child of its parent x -->
<y>y32</y>
</x>
</x>
</x>
and in the 2nd child of test
<x a="1">
<x a="2"> <!-- x is the first x child of its parent -->
<y>y21</y> <!-- y is the first y child of its parent x -->
<y>y22</y>
</x>
</x>
there is an x element that is the first x child of its parent and has a y as its first y child, so that y is matched.
But
<x a="1"> <!-- x is the 3rd x child of its parent test -->
<y>y11</y>
...
is the 3rd child of test, and
<x> <!-- x is the 4th x child of its parent test -->
<y>y03</y>
is the 4th child of test, so neither is x[1] for its parent and their y children are not matched.
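To make the per-parent counting concrete, here is a minimal Python sketch that emulates //x[1]/y[1] by hand on the question's document, using only the standard library. The helper name first_y_of_first_x is made up for this illustration, and it is a hand-written walk, not a real XPath engine.

```python
import xml.etree.ElementTree as ET

# The document from the question, compacted onto fewer lines.
DOC = """<test>
<x a="1"><x a="2"><x><y>y31</y><y>y32</y></x></x></x>
<x a="1"><x a="2"><y>y21</y><y>y22</y></x></x>
<x a="1"><y>y11</y><y>y12</y></x>
<x><y>y03</y><y>y04</y></x>
</test>"""


def first_y_of_first_x(root):
    """Emulate //x[1]/y[1]: for every parent, take its first x child,
    then that x's first y child (if it has one)."""
    hits = []
    for parent in root.iter():        # every element may act as a parent
        xs = parent.findall("x")      # x children of this parent, in document order
        if not xs:
            continue
        ys = xs[0].findall("y")       # y children of the first x child only
        if ys:
            hits.append(ys[0].text)
    return hits


selected = first_y_of_first_x(ET.fromstring(DOC))
print(len(selected), selected)  # 2 ['y31', 'y21']
```

The first x child of test has no y child of its own, so it contributes nothing; the 3rd and 4th x children of test are never looked at because they are not the first x child of their parent.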

Related

Why is xsl:value-of behaving completely different depending on the xsl:stylesheet version?

Looking at this XML:
<?xml version="1.0" encoding="UTF-8"?>
<root>
root
<child>
child 1
<grandchild>
grandchild 1
</grandchild>
<yetanothergrandchild>
yetanothergrandchild 1
</yetanothergrandchild>
</child>
<child>
child 2
<grandchild>
grandchild 2
</grandchild>
<yetanothergrandchild>
yetanothergrandchild 2
</yetanothergrandchild>
</child>
</root>
and this XSLT:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:fo="http://www.w3.org/1999/XSL/Format">
<xsl:output media-type="text" omit-xml-declaration="yes"/>
<xsl:template match="/">
<fo:root>
<fo:layout-master-set>
<fo:simple-page-master master-name="simple"
page-height="29.7cm"
page-width="21cm"
margin-top="1cm"
margin-bottom="2cm"
margin-left="2.5cm"
margin-right="2.5cm">
<fo:region-body margin-top="3cm"/>
<fo:region-before extent="3cm"/>
<fo:region-after extent="1.5cm"/>
</fo:simple-page-master>
</fo:layout-master-set>
<fo:page-sequence master-reference="simple">
<fo:flow flow-name="xsl-region-body">
<fo:block font-size="12pt"
font-family="sans-serif"
line-height="15pt"
space-after.optimum="3pt"
text-align="justify">
<xsl:value-of select="root/child/grandchild"/>
<xsl:value-of select="root/child/yetanothergrandchild"/>
</fo:block>
</fo:flow>
</fo:page-sequence>
</fo:root>
</xsl:template>
</xsl:stylesheet>
If I set the xsl:stylesheet version to 1.0, the output is:
grandchild 1 yetanothergrandchild 1
If I set it to 2.0, the output is:
grandchild 1 grandchild 2 yetanothergrandchild 1 yetanothergrandchild 2
Of course, I have already read through various lists of differences between XSLT 1.0 and 2.0, but I cannot find any hint of a change that could cause this.
Can somebody tell me how and why it behaves so differently?
See https://www.w3.org/TR/xslt20/#backwards and then https://www.w3.org/TR/xslt20/#incompatibilities, which says:
J.1.3 Backwards Compatibility Behavior
Some XSLT constructs behave
differently under XSLT 2.0 depending on whether backwards compatible
behavior is enabled. In these cases, the behavior may be made
compatible with XSLT 1.0 by ensuring that backwards compatible
behavior is enabled (which is done using the [xsl:]version attribute).
These constructs are as follows:
If the xsl:value-of instruction has no separator attribute, and the
value of the select expression is a sequence of more than one item,
then under XSLT 2.0 all items in the sequence will be output, space
separated, while in XSLT 1.0, all items after the first will be
discarded.
...
In XSLT 1.0 the xsl:value-of instruction returns the string-value of the first node in the selected node-set.
In XSLT 2.0 the instruction returns the value of every node in the selected sequence, separated by a space or by the string specified in the separator attribute.
These are my formulations, the specs are more difficult to follow.
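The two rules can be modelled in a few lines of Python. This is only an emulation of the behavior described above, not an XSLT processor; the function names value_of_1_0 and value_of_2_0 and the compacted sample document are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

# A compacted version of the question's document (whitespace removed for readability).
DOC = (
    "<root>"
    "<child><grandchild>grandchild 1</grandchild></child>"
    "<child><grandchild>grandchild 2</grandchild></child>"
    "</root>"
)


def string_value(node):
    """String value of an element: the concatenation of all its descendant text."""
    return "".join(node.itertext())


def value_of_1_0(nodes):
    """XSLT 1.0 rule: only the first node of the selected node-set is used."""
    return string_value(nodes[0]) if nodes else ""


def value_of_2_0(nodes, separator=" "):
    """XSLT 2.0 rule: every item is output, joined by the separator (default: one space)."""
    return separator.join(string_value(n) for n in nodes)


selected = ET.fromstring(DOC).findall("child/grandchild")
print(value_of_1_0(selected))  # grandchild 1
print(value_of_2_0(selected))  # grandchild 1 grandchild 2
```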

How to replace 1st node attribute value in xml using xpath

In the XML below, I need to replace the namespace using XPath.
<application xmlns="http://ns.adobe.com/air/application/4.0">
<child id="1"></child>
<child id="2"></child>
</application>
I tried with
/application/@xmlns
and
/*[local-name()='application']/@[local-name()='xmlns']
Both failed to give the desired output. To replace the text, I have used xmltask replace.
<xmltask source="${temp.file1}" dest="${temp.file1}">
<replace path="/application/@xmlns" withText="http://ns.adobe.com/air/application/16.0" />
</xmltask>
The problem is that xmlns is not an attribute. You cannot select it with XPath.
A namespace is part of the node name in XML: <foo xmlns="urn:foo-namespace" /> and <foo xmlns="urn:bar-namespace" /> are not two nodes with the same name and different attributes, they are two nodes with different names and no attributes.
If you want to change a namespace, you must construct a completely new node.
XSLT is better-suited to this task:
<!-- update-air-ns.xsl -->
<xsl:transform
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:air4="http://ns.adobe.com/air/application/4.0"
xmlns="http://ns.adobe.com/air/application/16.0"
>
<xsl:output method="xml" encoding="UTF-8" indent="yes" />
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="air4:*">
<xsl:element name="{local-name()}">
<xsl:apply-templates select="@*|node()"/>
</xsl:element>
</xsl:template>
</xsl:transform>
This XSLT transformation does two things:
the first template (identity template) copies nodes recursively, unless there is a better matching template for a given node
the second template matches elements in the air4 namespace and constructs new elements that have the same local name but a different namespace. This happens because of the default namespace declaration in the XSLT. The http://ns.adobe.com/air/application/16.0 namespace is used for all newly constructed elements.
Applied to your input XML, the result is
<application xmlns="http://ns.adobe.com/air/application/16.0">
<child id="1"/>
<child id="2"/>
</application>
You can use Ant's xslt task:
<xslt in="${temp.file1}" out="${temp.file1}" style="update-air-ns.xsl" />
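The point that the namespace is part of the element name can also be seen from Python's standard xml.etree.ElementTree, which stores each tag as {namespace}localname and does not expose xmlns as an attribute. The sketch below is an alternative illustration, not part of the Ant/XSLT approach above, and the helper name move_namespace is made up for it.

```python
import xml.etree.ElementTree as ET

OLD_NS = "http://ns.adobe.com/air/application/4.0"
NEW_NS = "http://ns.adobe.com/air/application/16.0"

DOC = '<application xmlns="%s"><child id="1"/><child id="2"/></application>' % OLD_NS

root = ET.fromstring(DOC)
tag_before = root.tag              # "{http://ns.adobe.com/air/application/4.0}application"
attrs_before = dict(root.attrib)   # {} : xmlns is not an attribute


def move_namespace(tree_root, old_ns, new_ns):
    """Rename every element in old_ns so that it lives in new_ns instead."""
    prefix = "{%s}" % old_ns
    for elem in tree_root.iter():
        if elem.tag.startswith(prefix):
            local_name = elem.tag[len(prefix):]
            elem.tag = "{%s}%s" % (new_ns, local_name)
    return tree_root


move_namespace(root, OLD_NS, NEW_NS)

# Serialize with the new namespace as the default (unprefixed) namespace.
ET.register_namespace("", NEW_NS)
print(ET.tostring(root, encoding="unicode"))
```

Changing the namespace here means rewriting each element's name, which mirrors what the second XSLT template does with xsl:element.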

XPath to find element with a HTML line break

I need an xpath that will find some text containing HTML line breaks <br/>. For example:
<ul>
<li>ABC<br/>DEF</li>
<li>XYZ<br/>NOP</li>
</ul>
Let's say I'm trying to find the li that contains ABC<br/>DEF. I've tried the following:
$x("//li[normalize-space(.)='ABC DEF']")
$x("//li[text() ='ABC<br/>DEF']")
$x("//li[contains(., 'ABC DEF')]")
But they return nothing. I saw the answer to "XPath contains(text(),'some string') doesn't work when used with node with more than one Text subnode", but I couldn't figure out how to apply it to my case.
The following expression will get you close:
li[br[preceding-sibling::node()[1] = 'ABC']
[starts-with(following-sibling::node()[1], 'DEF')]]
If you need to match only items where the text ends with ABC, it will be a little longer.
The following transform will select the first matching li:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes" />
<xsl:template match="/">
<matches>
<xsl:copy-of select="(//li[br[preceding-sibling::node()[1] = 'ABC']
[starts-with(following-sibling::node()[1], 'DEF')]
])
[1]" />
</matches>
</xsl:template>
</xsl:stylesheet>
Input:
<ul>
<li>ABC<br/>DEF</li>
<li>XYZ<br/>NOP</li>
<li><p>XYZ<br/>NOP</p></li>
<li>ABC<br/>DEF</li>
<li>DEF GHI</li>
<li>ABC<![CDATA[<br/>]]>DEF</li>
</ul>
Output:
<?xml version="1.0" encoding="utf-8"?>
<matches>
<li>ABC<br />DEF</li>
</matches>
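The attempts in the question fail because an empty br element contributes no text: the string value of the first li is ABCDEF, with no space and no markup in it. The Python sketch below shows that, and then emulates the sibling test from the expression above in a simplified way. The helper li_matches is hypothetical, and it only handles a br that is the first child element of the li.

```python
import xml.etree.ElementTree as ET

DOC = """<ul>
<li>ABC<br/>DEF</li>
<li>XYZ<br/>NOP</li>
<li><p>XYZ<br/>NOP</p></li>
<li>ABC<br/>DEF</li>
<li>DEF GHI</li>
</ul>"""

root = ET.fromstring(DOC)
items = root.findall("li")

# String value of the first li: the br adds nothing, so there is no space to match on.
string_value = "".join(items[0].itertext())
print(string_value)  # ABCDEF


def li_matches(li, before, after):
    """True if li has a br child with `before` as the text right before it
    and text starting with `after` right after it.
    Simplification: assumes the br is the first child element of the li."""
    if len(li) == 0 or li[0].tag != "br":
        return False
    text_before = li.text or ""       # text between <li> and <br/>
    text_after = li[0].tail or ""     # text between <br/> and the next node
    return text_before == before and text_after.startswith(after)


hits = [i for i, li in enumerate(items, start=1) if li_matches(li, "ABC", "DEF")]
print(hits)  # [1, 4]
```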
//li[br]
This should work. It selects all li elements that have a br child. Note that it does not check the text around the br.

How can I get a list of indexes of nodes that have a value using XPath

Using the following:
<a>
<b>false</b>
<b>true</b>
<b>false</b>
<b>false</b>
<b>true</b>
</a>
I want to get the following result using something like
/a/b[.='true'].position()
for a result like
2,5 (as in a collection of the 2 positions)
I. XPath 1.0 solution:
Use:
count(/*/*[.='true'][1]/preceding-sibling::*)+1
This produces the position of the first b element whose string value is "true":
2
Repeat the evaluation of a similar expression, where [1] is replaced by [2], and so on, up to count(/*/*[.='true']).
XSLT - based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:for-each select="/*/*[.='true']">
<xsl:variable name="vPos" select="position()"/>
<xsl:value-of select=
"count(/*/*[.='true'][$vPos]
/preceding-sibling::*) +1"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the provided XML document:
<a>
<b>false</b>
<b>true</b>
<b>false</b>
<b>false</b>
<b>true</b>
</a>
The XPath expression is constructed and evaluated for every b whose string value is "true". The results of these evaluations are copied to the output:
2
5
II. XPath 2.0 solution:
Use:
index-of(/*/*, 'true')
XSLT 2.0 - based verification:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:sequence select="index-of(/*/*, 'true')"/>
</xsl:template>
</xsl:stylesheet>
When this XSLT 2.0 transformation is applied on the same XML document (above), the XPath 2.0 expression is evaluated and the result of this evaluation is copied to the output:
2 5
A basic (and working) approach in Python, using lxml:
from lxml import etree

root = etree.XML("""
<a>
<b>false</b>
<b>true</b>
<b>false</b>
<b>false</b>
<b>true</b>
</a>
""")

# Collect the 1-based positions of the b elements whose text is 'true'.
positions = []
for index, b in enumerate(root.xpath('/a/b'), start=1):
    if b.text == 'true':
        positions.append(str(index))

print(",".join(positions))  # 2,5

Xpath query to select all elements ignoring descendants based on criteria

I am trying to select all the <set>erase</set> elements such that, if two or more elements in the same hierarchy contain <set>erase</set> (e.g. both <b> and <d>), only the one under the topmost parent is selected (i.e. <b> in this case).
Sample xml below:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<a>
<b>
<set>erase</set>
<d>
<set>erase</set>
</d>
</b>
<c>
<x></x>
</c>
<e>
<y>
<set>erase</set>
<q></q>
</y>
<z>
<p>
<set>erase</set>
</p>
</z>
</e>
</a>
When I use the query //set[contains(.,'erase')], I get the <set>erase</set> of all the nodes b, d, y and p in the result set.
I need help in framing a query that selects the <set>erase</set> of b, y and p only.
Here is a solution:
One XPath expression that selects exactly the wanted elements is:
//*[set[. = 'erase' and not(node()[2])]
and
not(ancestor::*
[set
[. = 'erase' and not(node()[2])]
]
)
]
XSLT - based verification:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:for-each select=
"//*[set[. = 'erase' and not(node()[2])]
and
not(ancestor::*
[set
[. = 'erase' and not(node()[2])]
]
)
]">
<xsl:value-of select="name()"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the provided XML document:
<a>
<b>
<set>erase</set>
<d>
<set>erase</set>
</d>
</b>
<c>
<x></x>
</c>
<e>
<y>
<set>erase</set>
<q></q>
</y>
<z>
<p>
<set>erase</set>
</p>
</z>
</e>
</a>
The contained XPath expression is evaluated and the names of the selected elements are output -- correctly and as expected:
b
y
p
If you need to select the set children of the elements selected above, just append /set to the XPath expression:
//*[set[. = 'erase' and not(node()[2])]
and
not(ancestor::*
[set
[. = 'erase' and not(node()[2])]
]
)
]
/set
Again, XSLT - based verification:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/">
<xsl:copy-of select=
"//*[set[. = 'erase' and not(node()[2])]
and
not(ancestor::*
[set
[. = 'erase' and not(node()[2])]
]
)
]
/set
"/>
</xsl:template>
</xsl:stylesheet>
This transformation just evaluates the above XPath expression and copies to the output the correctly selected three set elements:
<set>erase</set>
<set>erase</set>
<set>erase</set>
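The same "topmost only" rule can be sketched as a recursive walk in Python that stops descending as soon as it finds an element with a <set>erase</set> child. This is an approximation of the XPath above, not a translation of it: the helper names has_erase and topmost_erase are made up, and has_erase only checks for a set child whose text is exactly erase and which has no child elements.

```python
import xml.etree.ElementTree as ET

DOC = """<a>
<b><set>erase</set><d><set>erase</set></d></b>
<c><x></x></c>
<e>
<y><set>erase</set><q></q></y>
<z><p><set>erase</set></p></z>
</e>
</a>"""


def has_erase(elem):
    """True if elem has a direct set child containing just the text 'erase'."""
    return any(s.text == "erase" and len(s) == 0 for s in elem.findall("set"))


def topmost_erase(elem, found=None):
    """Collect elements that have an erase marker and no marked ancestor."""
    if found is None:
        found = []
    if has_erase(elem):
        found.append(elem)
        return found              # do not descend: descendants are shadowed
    for child in elem:
        topmost_erase(child, found)
    return found


selected = topmost_erase(ET.fromstring(DOC))
names = [e.tag for e in selected]
print(names)  # ['b', 'y', 'p']

# The set children of the selected elements, as in the /set variant above.
set_texts = [s.text for e in selected for s in e.findall("set")]
print(set_texts)  # ['erase', 'erase', 'erase']
```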

Resources