Position predicates with xmerl_xpath in Erlang

I'm trying to use xmerl_xpath to query a parsed XML document in Erlang, but I can't get the position predicates to work. Instead of returning the nth child element, I get the nth element of everything selected so far.
Sample code, where I'd like to extract the values 11 and 21 (the "first column"):
{XML,_} = xmerl_scan:string(
"<table>" ++
"<row><el>11</el><el>12</el></row>" ++
"<row><el>21</el><el>22</el></row>" ++
"</table>" ).
4 = length(xmerl_xpath:string( "//table/row/el", XML )). % OK
1 = length(xmerl_xpath:string( "(//table/row/el)[1]", XML )). % OK
1 = length(xmerl_xpath:string( "//table/row/el[1]", XML )). % Why not 2?
Is the result of the last query expected? What's the proper way, in the general case, to extract the nth child using xmerl_xpath?
(What I'm really trying to do is to parse HTML using mochiweb_html and query it using mochiweb_xpath, but the latter is essentially a wrapper around xmerl_xpath.)

I don't know Erlang, but the third XPath expression should indeed extract the first column: the two nodes <el>11</el> and <el>21</el>. I created a simple test XSLT:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<root>
<xsl:for-each select="//table/row/el[1]">
<xsl:copy-of select="."/>
</xsl:for-each>
</root>
</xsl:template>
</xsl:stylesheet>
that applied to
<table>
<row>
<el>11</el>
<el>12</el>
</row>
<row>
<el>21</el>
<el>22</el>
</row>
</table>
produces:
<root>
<el>11</el>
<el>21</el>
</root>
If this is not what you are getting, I suspect that the library you are using is buggy.
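For comparison, the same two queries can be checked outside Erlang. The sketch below assumes Python's standard xml.etree.ElementTree, whose limited XPath subset applies a positional predicate per parent. The grouped form (//table/row/el)[1] is not supported there, so it is emulated by slicing the result list.

```python
import xml.etree.ElementTree as ET

# Same document as in the question.
table = ET.fromstring(
    "<table>"
    "<row><el>11</el><el>12</el></row>"
    "<row><el>21</el><el>22</el></row>"
    "</table>"
)

# row/el[1]: the predicate is applied per parent row, so one el per row.
first_column = [el.text for el in table.findall("row/el[1]")]

# (row/el)[1]: the predicate applies to the whole result, so a single node.
first_overall = [el.text for el in table.findall("row/el")[:1]]

print(first_column)   # ['11', '21']
print(first_overall)  # ['11']
```

A conforming XPath 1.0 processor should agree with the first result for //table/row/el[1], which is what the XSLT test above shows.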

Related

XSLT Function Return Type

Originally: How to apply an XPath query to an XML variable typed as element()*
I wish to apply XPath queries to a variable passed to a function in XSLT 2.0.
Saxon returns this error:
Type error at char 6 in xsl:value-of/@select on line 13 column 50 of stackoverflow_test.xslt:
XTTE0780: Required item type of result of call to f:test is element(); supplied value has item type text()
This skeleton of a program is simplified but, by the end of its development, it is meant to pass an element tree to multiple XSLT functions. Each function will extract certain statistics and create reports from the tree.
By "apply XPath queries" I mean that the query should treat the element held in the variable as its base, as if I could write {count(doc("My XSLT tree/element variable")/a[1])}.
Using Saxon HE 9.7.0.5.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:f="f:f">
<xsl:template match="/root">
<xsl:variable name="first" as="element()*">
<xsl:copy-of select="(./a[1])" />
</xsl:variable>
<html>
<xsl:copy-of select="f:test($first)" />
</html>
</xsl:template>
<xsl:function name="f:test" as="element()*">
<xsl:param name="frstElem" as="element()*" />
<xsl:value-of select="count($frstElem/a)" />
<!-- or any XPath expression -->
</xsl:function>
</xsl:stylesheet>
Some example data
<root>
<a>
<b>
<c>hi</c>
</b>
</a>
<a>
<b>
<c>hi</c>
</b>
</a>
</root>
Possibly related question: How to apply xpath in xsl:param on xml passed as input to xml
Applying a path expression to the variable is perfectly correct, except that you have passed an a element to the function, and the function looks for an a child of that element. With your sample data, $frstElem/a selects an empty sequence, so count() returns 0.
If you want f:test() to return the number of a elements in the sequence that is the value of $frstElem, you can use something like
<xsl:value-of select="count($frstElem/self::a)" />
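instead of using the (implicit) child:: axis.
The XTTE0780 error reported in the question is a separate matter: xsl:value-of constructs a text node, which does not match the function's declared result type as="element()*".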
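The difference between the child:: and self:: axes can be illustrated outside XSLT. This is a minimal sketch assuming Python's standard xml.etree.ElementTree, which has no self:: axis, so that step is emulated by comparing tag names. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<root><a><b><c>hi</c></b></a><a><b><c>hi</c></b></a></root>"
)

# Counterpart of $frstElem: a sequence holding the first <a> child of <root>.
first = root.findall("a")[:1]

# count($frstElem/a): <a> children of each item; the sample <a> has none.
child_count = sum(len(item.findall("a")) for item in first)

# count($frstElem/self::a): the items that are themselves <a> elements.
self_count = sum(1 for item in first if item.tag == "a")

print(child_count, self_count)  # 0 1
```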

XPath: Get text that contains Obama but not Romney

I am quite new to XPath, so bear with me. I have an XPath expression
'.//*[contains(.,"Obama")]/text()'
that gets me the text that contains "Obama". However, I haven't been able to figure out how to add
and [not(contains(., "Romney"))] to the expression without getting a syntax error. How is it done? Help much appreciated!
Use:
.//*[contains(.,"Obama") and not(contains(.,"Romney"))]/text()
XSLT-based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:copy-of select=
'.//*[contains(.,"Obama") and not(contains(.,"Romney"))]/text()'/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the following XML document:
<election>
<choice>Maybe Obama</choice>
<choice>Maybe Romney</choice>
</election>
the XPath expression is evaluated and the selected node is copied to the output:
Maybe Obama
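The same filter can be spelled out step by step in a host language. The sketch below assumes Python's standard xml.etree.ElementTree, which has no contains() function, so the predicate is emulated with substring tests on each element's string-value. The helper names string_value and text_nodes are made up for this illustration.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<election>"
    "<choice>Maybe Obama</choice>"
    "<choice>Maybe Romney</choice>"
    "</election>"
)

def string_value(elem):
    # XPath string-value of an element: all descendant text concatenated.
    return "".join(elem.itertext())

def text_nodes(elem):
    # Direct text-node children: the element's own text plus child tails.
    parts = [elem.text] + [child.tail for child in elem]
    return [p for p in parts if p]

selected = [
    text
    for elem in doc.iter()  # the root element and all of its descendants
    if "Obama" in string_value(elem) and "Romney" not in string_value(elem)
    for text in text_nodes(elem)
]

print(selected)  # ['Maybe Obama']
```

The election element is rejected because its string-value contains both names, which is why only the first choice contributes a text node.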
Do note:
SomeExpression[x][y]
is not always equivalent to:
SomeExpression[x and y]
Therefore, the latter form is recommended, not the former, which is the one used in the answer by @ChrisGerken.
Here is a concrete example:
Let's have this XML document:
<nums>
<num>01</num>
<num>02</num>
<num>03</num>
<num>04</num>
<num>05</num>
<num>06</num>
<num>07</num>
<num>08</num>
<num>09</num>
<num>10</num>
</nums>
and these two XPath expressions:
/*/*[. mod 3 = 0 and position() = 3]
and
/*/*[. mod 3 = 0][position() = 3]
The first expression selects:
<num>03</num>
However, the second expression selects:
<num>09</num>
And here is a complete XSLT-based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:copy-of select=
"/*/*[. mod 3 = 0 and position() = 3]"/>
================
<xsl:copy-of select=
"/*/*[. mod 3 = 0][position() = 3]"/>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the above XML document, the two XPath expressions are evaluated and the results of these evaluations are copied to the output:
<num>03</num>
================
<num>09</num>
Explanation:
position() is a context-sensitive function: each predicate is evaluated against the node-set left by the previous one, so position() typically produces different results when used in the k-th and in the m-th predicate, where k != m.
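The renumbering of position() between predicates can be emulated with plain list operations. This is a sketch in Python that models the ten num elements as integers; it illustrates the semantics and is not an XPath evaluator.

```python
nums = list(range(1, 11))  # values of the ten <num> elements

# [. mod 3 = 0 and position() = 3]: both conditions are tested against the
# original list, so position() is the index within all ten nodes.
combined = [n for pos, n in enumerate(nums, start=1) if n % 3 == 0 and pos == 3]

# [. mod 3 = 0][position() = 3]: the second predicate sees only the nodes
# kept by the first, so position() is renumbered over 3, 6, 9.
kept = [n for n in nums if n % 3 == 0]
chained = [n for pos, n in enumerate(kept, start=1) if pos == 3]

print(combined, chained)  # [3] [9]
```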
Try this:
'.//*[contains(.,"Obama")][not(contains(.,"Romney"))]/text()'
You can put as many predicates as you like one after another:
[a][b][c]

xsl sorting on an average value of 3 child elements

I have the following XML. What I want to do with my XSL is sort the output on the total value of the elements productDesignRating, productPriceRating and productPerformanceRating. So far I have had no luck getting this done. Any help will be appreciated. I need to be able to do this in XSLT 1.0, so no XSLT 2.0 functions.
<DocumentElement xmlns="DotNetNuke/UserDefinedTable">
<QueryResults>
<productCategory>cat1</productCategory>
<productTitle>product1</productTitle>
<productImage><img alt="productImage" title="productImage" src="/skinconversion/Portals/12/babynokiko.jpg" /></productImage>
<productDesignRating>3</productDesignRating>
<productPriceRating>4</productPriceRating>
<productPerformanceRating>4</productPerformanceRating>
<productPrice>10</productPrice>
<productSummary>description</productSummary>
<productUrl>http://www.2dnn.com</productUrl>
</QueryResults>
<QueryResults>
<productCategory>cat2</productCategory>
<productTitle>product2</productTitle>
<productImage><img alt="productImage" title="productImage" src="/skinconversion/Portals/12/babynokiko.jpg" /></productImage>
<productDesignRating>3</productDesignRating>
<productPriceRating>3</productPriceRating>
<productPerformanceRating>3</productPerformanceRating>
<productPrice>10</productPrice>
<productSummary>description</productSummary>
<productUrl>http://www.2dnn.com</productUrl>
</QueryResults>
<QueryResults>
<productCategory>cat3</productCategory>
<productTitle>product3</productTitle>
<productImage><img alt="productImage" title="productImage" src="/skinconversion/Portals/12/babynokiko.jpg" /></productImage>
<productDesignRating>1</productDesignRating>
<productPriceRating>2</productPriceRating>
<productPerformanceRating>3</productPerformanceRating>
<productPrice>56</productPrice>
<productSummary>description</productSummary>
<productUrl>http://www.2dnn.com</productUrl>
</QueryResults>
</DocumentElement>
Try this:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:my="DotNetNuke/UserDefinedTable">
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates select="node()">
<xsl:sort select="my:productDesignRating +
my:productPriceRating +
my:productPerformanceRating"
data-type="number"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
This is just a simplified identity template (it does not process attributes) that applies a numerical sort on the sum of the values of the 3 specified child elements, if they are present.
Note in particular the data-type attribute, which specifies that the sort keys should be compared as numbers, so the order here is 6, 9, 11 (for text, which is the default, it would be 11, 6, 9).
[To reverse the order, simply add the order="descending" attribute.]
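The effect of the numeric sort key can be checked independently of XSLT. The sketch below assumes Python's standard xml.etree.ElementTree and a trimmed copy of the input that keeps only the title and the three ratings. The names NS, RATINGS and total are chosen for this illustration.

```python
import xml.etree.ElementTree as ET

NS = {"my": "DotNetNuke/UserDefinedTable"}  # same namespace as the input
RATINGS = ("productDesignRating", "productPriceRating", "productPerformanceRating")

doc = ET.fromstring("""
<DocumentElement xmlns="DotNetNuke/UserDefinedTable">
  <QueryResults><productTitle>product1</productTitle>
    <productDesignRating>3</productDesignRating>
    <productPriceRating>4</productPriceRating>
    <productPerformanceRating>4</productPerformanceRating></QueryResults>
  <QueryResults><productTitle>product2</productTitle>
    <productDesignRating>3</productDesignRating>
    <productPriceRating>3</productPriceRating>
    <productPerformanceRating>3</productPerformanceRating></QueryResults>
  <QueryResults><productTitle>product3</productTitle>
    <productDesignRating>1</productDesignRating>
    <productPriceRating>2</productPriceRating>
    <productPerformanceRating>3</productPerformanceRating></QueryResults>
</DocumentElement>
""")

def total(result):
    # Numeric sum of the three ratings, like data-type="number" in xsl:sort.
    return sum(float(result.findtext(f"my:{name}", namespaces=NS)) for name in RATINGS)

ordered = sorted(doc.findall("my:QueryResults", NS), key=total)
titles = [r.findtext("my:productTitle", namespaces=NS) for r in ordered]
totals = [total(r) for r in ordered]

# Comparing the same keys as text reproduces the default (non-numeric) order.
as_text = sorted(str(int(t)) for t in totals)

print(titles)   # ['product3', 'product2', 'product1']
print(totals)   # [6.0, 9.0, 11.0]
print(as_text)  # ['11', '6', '9']
```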

XPath to return default value if node not present

Say I have a pair of XML documents
<Foo>
<Bar/>
<Baz>mystring</Baz>
</Foo>
and
<Foo>
<Bar/>
</Foo>
I want an XPath (Version 1.0 only) that returns "mystring" for the first document and "not-found" for the second. I tried
(string('not-found') | //Baz)[last()]
but the left hand side of the union isn't a node-set
In XPath 1.0, use:
concat(/Foo/Baz,
substring('not-found', 1 div not(/Foo/Baz)))
If you want to handle a possibly empty Baz element, use:
concat(/Foo/Baz,
substring('not-found', 1 div not(/Foo/Baz[node()])))
With this input:
<Foo>
<Baz/>
</Foo>
Result: the string not-found.
Special case:
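If you want to get 0 when a numeric node is missing, use the sum(/Foo/Baz) function. To get 0 for an empty element as well, filter it out first with sum(/Foo/Baz[node()]), because sum() over an empty string value yields NaN.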
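The concat()/substring() expression works because 1 div true() is 1 and 1 div false() is Infinity: substring('not-found', 1) is the whole string, while substring('not-found', Infinity) is empty. The sketch below emulates that arithmetic in Python with the standard xml.etree.ElementTree. The helper xpath_substring is a simplified stand-in written for this illustration, and the Baz[node()] test is approximated by checking for text or child elements.

```python
import math
import xml.etree.ElementTree as ET

def xpath_substring(s, start):
    # Simplified XPath 1.0 substring(s, start): keep the characters whose
    # 1-based position is >= start (start is only 1 or Infinity here).
    return "".join(ch for pos, ch in enumerate(s, start=1) if pos >= start)

def baz_or_default(xml_text):
    foo = ET.fromstring(xml_text)
    baz = foo.find("Baz")
    # Approximation of /Foo/Baz[node()]: Baz exists and has text or children.
    has_content = baz is not None and (bool(baz.text) or len(baz) > 0)
    # 1 div not(...): 1 when the node is absent or empty, Infinity otherwise.
    start = math.inf if has_content else 1
    value = "".join(baz.itertext()) if baz is not None else ""
    return value + xpath_substring("not-found", start)

print(baz_or_default("<Foo><Bar/><Baz>mystring</Baz></Foo>"))  # mystring
print(baz_or_default("<Foo><Bar/></Foo>"))                     # not-found
print(baz_or_default("<Foo><Baz/></Foo>"))                     # not-found
```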
@Alejandro provided the best XPath 1.0 answer, which has been known for years, since it was first used by Jeni Tennison almost ten years ago.
The only problem with this expression is its shiny elegance, which makes it difficult to understand, and not only for novice programmers.
In a hosted XPath 1.0 (and every XPath is hosted!) one can use more understandable expressions:
string((/Foo/Baz | $vDefaults[not(/Foo/Baz/text())]/Foo/Baz)[last()])
Here the variable $vDefaults is a separate document that has the same structure as the primary XML document, and whose text nodes contain default values.
Or, if XSLT is the hosting language, one can use the document() function:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:my="my:my">
<xsl:output method="text"/>
<my:defaults>
<Foo>
<Bar/>
<Baz>not-found</Baz>
</Foo>
</my:defaults>
<xsl:template match="/">
<xsl:value-of select=
"concat(/Foo/Baz,
document('')[not(current()/Foo/Baz/text())]
/*/my:defaults/Foo/Baz
)"/>
</xsl:template>
</xsl:stylesheet>
Or, not using concat():
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:my="my:my">
<xsl:output method="text"/>
<my:defaults>
<Foo>
<Bar/>
<Baz>not-found</Baz>
</Foo>
</my:defaults>
<xsl:variable name="vDefaults" select="document('')/*/my:defaults"/>
<xsl:template match="/">
<xsl:value-of select=
"(/Foo/Baz
| $vDefaults/Foo/Baz[not(current()/Foo/Baz/text())]
)
[last()]"/>
</xsl:template>
</xsl:stylesheet>
In XPath 2.0 (this syntax is not valid in XPath 1.0), one can write:
/Foo/(Baz/string(), 'not-found')[1]
If you are okay with printing an empty string instead of 'not-found' message then use:
/Foo/concat(Baz/text(), '')
Later, you can replace the empty strings with 'not-found'.

how to for every parent node select every not first child node in a tree with multiple parent nodes

Hi,
I think I've got a tricky question for XPath experts. There is a node structure like this:
A(1)-|
|-B(1)
|-B(2)
|-B(3)
A(2)-|
|-B(2.1)
|-B(2.2)
|-B(2.3)
...
How to, with a single XPath-expression, extract only the following nodes
A(1)-|
|-B(2)
|-B(3)
A(2)-|
|-B(2.2)
|-B(2.3)
...
That is for every parent node its first child element should be excluded.
I tried A/B[position() != 1] but this would filter out only B(1.1) and select B(2.1).
Thanks
This XPath expression (no preceding-sibling:: axis used):
/*/a/*[not(position()=1)]
when applied on this XML document:
<t>
<a>
<b11/>
<b12/>
<b13/>
</a>
<a>
<b21/>
<b22/>
<b23/>
</a>
</t>
selects the wanted nodes:
<b12 />
<b13 />
<b22 />
<b23 />
This can be verified with this XSLT transformation, producing the above result:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:copy-of select="/*/a/*[not(position()=1)]"/>
</xsl:template>
</xsl:stylesheet>
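The per-parent behaviour of position() can also be emulated in a host language. This is a minimal sketch assuming Python's standard xml.etree.ElementTree, which does not support not(position()=1), so the predicate is replaced by slicing each parent's child list.

```python
import xml.etree.ElementTree as ET

t = ET.fromstring(
    "<t>"
    "<a><b11/><b12/><b13/></a>"
    "<a><b21/><b22/><b23/></a>"
    "</t>"
)

# /*/a/*[not(position()=1)]: position() restarts at 1 under every <a>,
# so the first child is skipped once per parent.
selected = [child.tag for a in t.findall("a") for child in list(a)[1:]]

print(selected)  # ['b12', 'b13', 'b22', 'b23']
```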
Tricky. You could select nodes that have preceding siblings:
A/B[preceding-sibling::*]
This will fail for the first element and succeed for the rest.

Resources