XPath: compare a tokenized list with another tokenized list

In XSLT I have a variable that holds a single space-delimited string:
var_1 = 'cat dog cow'
The following XML is given; the value of each 'name' attribute is also space-delimited.
<top_element>
<element_10>
<element_11>
<element_000 name="cow cat">
string_1
</element_000>
</element_11>
</element_10>
<element_20 name="bat">
<element_21>
<element_000 name="cow cat">
string_2
</element_000>
</element_21>
</element_20>
<element_30 name="bat dog">
<element_31>
<element_000 name="cow cat">
string_3
</element_000>
</element_31>
</element_30>
<element_40 >
<element_41>
<element_000 name="cow bat">
string_4
</element_000>
</element_41>
</element_40>
</top_element>
Question:
Handle an element_000 in the XML only if, for every ancestor-or-self element, either:
the name attribute is not defined, or
the value of the 'name' attribute contains at least one value from var_1.
When the XML is processed with XSLT, the output HTML should contain only the following strings:
string_1
string_3
string_4
string_2 should not be displayed, because one of its ancestors has a name value that matches none of the values in the var_1 list.
My try:
Note: I am using xslt version 2.0
<xsl:template match="element_000">
<xsl:choose>
<xsl:when test="(ancestor-or-self::*[(tokenize(@name,'\s+')) != (tokenize($var_1,'\s+'))])">
<!-- Handle the element -->
</xsl:when>
</xsl:choose>
</xsl:template>
This approach did not work for me. Please let me know if this is possible by some other way.
Thanks
SRB.

If you use
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:param name="var_1" select="'cat dog cow'"/>
<xsl:variable name="names" select="tokenize($var_1, ' ')"/>
<xsl:template match="text()"/>
<xsl:template match="element_000[every $el in ancestor-or-self::*
satisfies (not(exists($el/@name)) or $names = tokenize($el/@name, ' '))]">
<xsl:copy-of select="."/>
</xsl:template>
</xsl:stylesheet>
then the output contains only the elements
<element_000 name="cow cat">
string_1
</element_000><element_000 name="cow cat">
string_3
</element_000><element_000 name="cow bat">
string_4
</element_000>
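For readers who want to cross-check the logic outside XSLT, here is a minimal Python sketch of the same rule: keep an element_000 only if every ancestor-or-self element either has no name attribute or shares at least one token with var_1. It uses only the standard library; the function name allowed_strings and the compacted copy of the sample XML are illustrative assumptions, not part of the original stylesheet.

```python
import xml.etree.ElementTree as ET

XML = """<top_element>
  <element_10><element_11>
    <element_000 name="cow cat">string_1</element_000>
  </element_11></element_10>
  <element_20 name="bat"><element_21>
    <element_000 name="cow cat">string_2</element_000>
  </element_21></element_20>
  <element_30 name="bat dog"><element_31>
    <element_000 name="cow cat">string_3</element_000>
  </element_31></element_30>
  <element_40><element_41>
    <element_000 name="cow bat">string_4</element_000>
  </element_41></element_40>
</top_element>"""


def allowed_strings(xml_text, var_1):
    """Return the text of each element_000 whose whole ancestor chain passes."""
    wanted = set(var_1.split())
    root = ET.fromstring(xml_text)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent = {child: node for node in root.iter() for child in node}
    kept = []
    for el in root.iter("element_000"):
        node, ok = el, True
        while node is not None and ok:
            name = node.get("name")
            # Pass if there is no name attribute or the token sets intersect.
            ok = name is None or bool(wanted & set(name.split()))
            node = parent.get(node)
        if ok:
            kept.append(el.text.strip())
    return kept


print(allowed_strings(XML, "cat dog cow"))
```

The set intersection plays the role of the general comparison `$names = tokenize(...)`, which is true as soon as any pair of items is equal.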

Related

Getting invalid date issue while comparing dates in XSLT code

I have a requirement to validate two scenarios: the offer start date should fall before the offer end date, and the offer start date should fall after the account start date. If either scenario is not met, an error should be thrown.
The offer start date and offer end date values appear as space-separated lists in the offerStartDate and offerEndDate tags respectively.
Below is the sample xml code:
<Accounts>
<Account>
<AccountStartDate>2020-12-01</AccountStartDate>
<offerStartDate>2020-10-02 2020-11-02</offerStartDate>
<offerEndDate>2019-10-02 2019-11-02</offerEndDate>
</Account>
</Accounts>
Below is the sample xslt code:
<xsl:for-each select="Accounts/Account">
<xsl:variable name="offerSDate" select="offerStartDate"/>
<xsl:variable name="offerEDate" select="offerEndDate"/>
<xsl:if test="$offerSDate > xs:date(AccountStartDate)">
<Error>
<xsl:text>Error: Invalid offer Date
</xsl:text>
</Error>
</xsl:if>
<xsl:if test="$offerSDate > $offerEDate">
<Error>
<xsl:text>Error: Invalid offer Date
</xsl:text>
</Error>
</xsl:if>
</xsl:for-each>
After executing the XSLT code, I get an invalid date error for "2020-10-02 2020-11-02".
If you want to do a separate comparison for each date in offerStartDate, then you could do (in XSLT 2.0) either:
<xsl:for-each select="Account">
<xsl:if test="some $offerStartDate in tokenize(offerStartDate, ' ') satisfies xs:date($offerStartDate) gt xs:date(AccountStartDate)">
<Error>error message</Error>
</xsl:if>
</xsl:for-each>
or (depending on what meaning your test should have):
<xsl:for-each select="Account">
<xsl:if test="every $offerStartDate in tokenize(offerStartDate, ' ') satisfies xs:date($offerStartDate) gt xs:date(AccountStartDate)">
<Error>error message</Error>
</xsl:if>
</xsl:for-each>
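The difference between the "some" and "every" forms can be seen in a small Python sketch, where any() and all() stand in for the two quantifiers. The helper name offer_dates is hypothetical, and the sketch assumes the dates are separated by whitespace and are valid ISO dates.

```python
from datetime import date


def offer_dates(text):
    """Split a space-separated list and parse each token as an ISO date."""
    return [date.fromisoformat(token) for token in text.split()]


account_start = date.fromisoformat("2020-12-01")
starts = offer_dates("2020-10-02 2020-11-02")

# "some ... satisfies": true if at least one offer start date is later.
some_after = any(d > account_start for d in starts)
# "every ... satisfies": true only if all offer start dates are later.
every_after = all(d > account_start for d in starts)

print(some_after, every_after)
```

Parsing each token separately is what avoids the invalid date error: the whole string "2020-10-02 2020-11-02" is not a date, but each token is.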
Probably the easiest way to do it with only XSLT is to convert your XML from:
<Accounts>
<Account>
<AccountStartDate>2020-12-01</AccountStartDate>
<offerStartDate>2020-10-02 2020-11-02</offerStartDate>
<offerEndDate>2019-10-02 2019-11-02</offerEndDate>
</Account>
</Accounts>
To something like:
<Accounts>
<Account>
<AccountStartDate>2020-12-01</AccountStartDate>
<offer>
<offerStartDate>2020-10-02</offerStartDate>
<offerEndDate>2019-10-02</offerEndDate>
</offer>
<offer>
<offerStartDate>2020-11-02</offerStartDate>
<offerEndDate>2019-11-02</offerEndDate>
</offer>
</Account>
</Accounts>
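The restructuring above amounts to pairing the n-th start date with the n-th end date. A rough Python sketch of that pairing and of the two checks from the question follows; the function names split_offers and validate are hypothetical, the comparisons are assumed to be strict, and the two lists are assumed to have the same length.

```python
from datetime import date


def split_offers(start_text, end_text):
    """Pair the n-th offer start date with the n-th offer end date."""
    starts, ends = start_text.split(), end_text.split()
    if len(starts) != len(ends):
        raise ValueError("start and end lists differ in length")
    return [(date.fromisoformat(s), date.fromisoformat(e))
            for s, e in zip(starts, ends)]


def validate(account_start, offers):
    """Return one message per violated rule, per offer."""
    errors = []
    for start, end in offers:
        if not start < end:
            errors.append(f"{start}: offer start is not before offer end")
        if not start > account_start:
            errors.append(f"{start}: offer start is not after account start")
    return errors


offers = split_offers("2020-10-02 2020-11-02", "2019-10-02 2019-11-02")
for message in validate(date(2020, 12, 1), offers):
    print(message)
```

With the sample data both offers violate both rules, so four messages are printed.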

xslt Merge children of 2 parents and Store in a variable

I receive an xml input like this:
<root>
<Tuple1>
<child11></child11>
<child12></child12>
<child13></child13>
</Tuple1>
<Tuple1>
<child11></child11>
<child12></child12>
</Tuple1>
<Tuple2>
<child21></child21>
<child22></child22>
</Tuple2>
<Tuple2>
<child21></child21>
<child22></child22>
<child23></child23>
</Tuple2>
</root>
How can I merge the children of each Tuple1 with children of Tuple2 and store them in a variable that will be used in the rest of xslt document?
First tuple1 will be merged with first Tuple2 and second Tuple1 will be merged with 2nd Tuple2 and so on. The merged output that should be stored in variable would look like this in memory:
<root>
<Tuple1>
<child11></child11>
<child12></child12>
<child13></child13>
<child21></child21>
<child22></child22>
</Tuple1>
<Tuple1>
<child11></child11>
<child12></child12>
<child21></child21>
<child22></child22>
<child23></child23>
</Tuple1>
</root>
Is a variable the best option? If we use a variable, is it created once, or every time it is referenced?
I use XSLT 3.0, so a solution for any version can help.
Thanks, I appreciate your help.
Here is a minimal XSLT 3 approach:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="root">
<xsl:variable name="temp1">
<xsl:copy>
<xsl:apply-templates select="Tuple1"/>
</xsl:copy>
</xsl:variable>
<xsl:copy-of select="$temp1"/>
</xsl:template>
<xsl:template match="Tuple1">
<xsl:copy>
<xsl:copy-of select="*, let $pos := position() return ../Tuple2[$pos]/*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Online at https://xsltfiddle.liberty-development.net/bdxtqg. I have used XPath's let instead of XSLT's xsl:variable to store the position used to access the matching Tuple2.
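The positional merge can also be sketched in Python, which may help when checking the expected result. This is an illustration under assumptions: the function name merge_tuples is made up, the sample input is written with self-closing children, and a Tuple1 without a Tuple2 at the same position simply keeps its own children, as in the stylesheet.

```python
import xml.etree.ElementTree as ET

XML = """<root>
  <Tuple1><child11/><child12/><child13/></Tuple1>
  <Tuple1><child11/><child12/></Tuple1>
  <Tuple2><child21/><child22/></Tuple2>
  <Tuple2><child21/><child22/><child23/></Tuple2>
</root>"""


def merge_tuples(xml_text):
    """Build a new root whose n-th Tuple1 also holds the n-th Tuple2's children."""
    src = ET.fromstring(xml_text)
    tuple1s, tuple2s = src.findall("Tuple1"), src.findall("Tuple2")
    merged = ET.Element(src.tag)
    for index, t1 in enumerate(tuple1s):
        out = ET.SubElement(merged, "Tuple1")
        out.extend(list(t1))
        if index < len(tuple2s):
            # Same position in the Tuple2 list, like ../Tuple2[$pos].
            out.extend(list(tuple2s[index]))
    return merged


merged = merge_tuples(XML)
print([[child.tag for child in t] for t in merged])
```

Note that the sketch reuses the original child elements instead of copying them, which is fine for inspection but differs from xsl:copy-of.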

how can I get a list of indexes of nodes that have a value using xpath

Using the following:
<a>
<b>false</b>
<b>true</b>
<b>false</b>
<b>false</b>
<b>true</b>
</a>
I want to get the following result using something like
/a/b[.='true'].position()
for a result like
2,5 (as in a collection of the 2 positions)
I. XPath 1.0 solution:
Use:
count(/*/*[.='true'][1]/preceding-sibling::*)+1
This produces the position of the first b element whose string value is "true":
2
Repeat the evaluation of a similar expression, where [1] is replaced by [2] ,..., etc, up to count(/*/*[.='true'])
XSLT - based verification:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:for-each select="/*/*[.='true']">
<xsl:variable name="vPos" select="position()"/>
<xsl:value-of select=
"count(/*/*[.='true'][$vPos]
/preceding-sibling::*) +1"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the provided XML document:
<a>
<b>false</b>
<b>true</b>
<b>false</b>
<b>false</b>
<b>true</b>
</a>
The XPath expression is constructed and evaluated for every b whose string value is "true". The results of these evaluations are copied to the output:
2
5
II. XPath 2.0 solution:
Use:
index-of(/*/*, 'true')
XSLT 2.0 - based verification:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:sequence select="index-of(/*/*, 'true')"/>
</xsl:template>
</xsl:stylesheet>
When this XSLT 2.0 transformation is applied on the same XML document (above), the XPath 2.0 expression is evaluated and the result of this evaluation is copied to the output:
2 5
A basic (and working) approach in Python:
from lxml import etree
root = etree.XML("""
<a>
<b>false</b>
<b>true</b>
<b>false</b>
<b>false</b>
<b>true</b>
</a>
""")
# Collect the 1-based positions of the b elements whose text is 'true'.
positions = []
for position, text in enumerate(root.xpath('/a/b/text()'), start=1):
    if text == 'true':
        positions.append(str(position))
print(",".join(positions))

Set Union Operator in xpath

<xsl:variable name="targetReceiverService">
<EMP_EMPLOC_MAL curr="4.0">MAL</EMP_EMPLOC_MAL>
<EMP_EMPLOC_SIN curr="1.6">SIN</EMP_EMPLOC_SIN>
<EMP_EMPLOC_CHN curr="7.8">CHN</EMP_EMPLOC_CHN>
<DEFAULT curr="1.0">NONE</DEFAULT>
</xsl:variable>
<xsl:variable name="targetCountryCode" select="$targetReceiverService/*[name() = $ReceiverService] | $targetReceiverService/DEFAULT"/>
<xsl:value-of select="$targetCountryCode "/>
Why is the value displayed for $targetCountryCode only MAL, without NONE, given that "|" means union?
xsl:value-of only outputs the value of the first node in the set (at least in XSLT 1.0).
You can probably display all of them with
<xsl:for-each select="$targetCountryCode">
<xsl:value-of select="."/>
</xsl:for-each>
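To make the behaviour concrete, here is a small Python sketch of what the union holds and what each output style prints. It assumes $ReceiverService is 'EMP_EMPLOC_MAL' (the question only shows that MAL was printed), and the services wrapper element and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

SERVICES = ET.fromstring("""<services>
  <EMP_EMPLOC_MAL curr="4.0">MAL</EMP_EMPLOC_MAL>
  <EMP_EMPLOC_SIN curr="1.6">SIN</EMP_EMPLOC_SIN>
  <EMP_EMPLOC_CHN curr="7.8">CHN</EMP_EMPLOC_CHN>
  <DEFAULT curr="1.0">NONE</DEFAULT>
</services>""")


def union_in_document_order(receiver_service):
    """Mimic the '|' union: matching nodes, no duplicates, document order."""
    return [el for el in SERVICES if el.tag in (receiver_service, "DEFAULT")]


selected = union_in_document_order("EMP_EMPLOC_MAL")
first_only = selected[0].text                      # XSLT 1.0 xsl:value-of
all_values = "".join(el.text for el in selected)   # the xsl:for-each loop

print(first_only)
print(all_values)
```

The union does contain both nodes; only the way the value is printed hides the second one.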

XSLT - How to speed up a complex for-each

I am new to XSLT and I'm having speed issues with the following for-each statement. Could someone give me some pointers on how to optimise it?
The for-each below loops through about 4 MB of XML. It tests that each hotel node has a description and a destination. It also tests that each hotel has a rating greater than 2 but not 6. The possible values for the rating in the XML are 0, 1, 2, 3, 4, 5 or 6. Ideally I would like it to select only ratings 3, 4 or 5 and ignore the others.
<xsl:for-each select="response/results/hotel[
not(@description = '') and
@rating > '2' and
not(@rating = '6') and
not(@destination = '') ]">
<xsl:call-template name="hotelparams"/>
<xsl:call-template name="upropdata"/>
<xsl:call-template name="request"/>
<xsl:call-template name="Newline"/>
</xsl:for-each>
As requested, I have added the called templates below. The output is tab-delimited text files, which are then imported into MySQL. Please ignore the upropdata template; it will be removed shortly.
<xsl:template name="hotelparams">
<xsl:value-of select="@itemcode"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@cheapestcurrency"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@cheapestprice"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@checkin"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@checkout"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@description"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@destair"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@destination"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@destinationid"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@engine"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@hotelname"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@image"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@nights"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@rating"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@resultkey"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@resultno"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@supplierdestination"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@type"/></xsl:template>
<xsl:template name="upropdata">
<xsl:value-of select="$tab"/>\N<xsl:value-of select="$tab"/>\N<xsl:value-of select="$tab"/>\N<xsl:value-of select="$tab"/>\N<xsl:value-of select="$tab"/>\N<xsl:value-of select="$tab"/>2011-01-01</xsl:template>
<xsl:template name="request">
<xsl:for-each select="/response/request/method"><xsl:value-of select="$tab"/><xsl:value-of select="./@sessionkey"/></xsl:for-each></xsl:template>
<xsl:template name="Newline">
<xsl:text>
</xsl:text></xsl:template>
How about ...
<xsl:for-each select="response/results/hotel
[not(@description = '')]
[@rating = (3,4,5)]">
<xsl:call-template name="hotelparams"/>
<xsl:call-template name="upropdata"/>
<xsl:call-template name="request"/>
<xsl:call-template name="Newline"/>
</xsl:for-each>
Note: I have not included a check for destination, because you did not specify its node name.
Also, if you can rule out empty description attributes (that is, a hotel has either a non-empty description or no description attribute at all), then you can use this slightly shorter form...
<xsl:for-each select="response/results/hotel
[@description]
[@rating = (3,4,5)]">
<xsl:call-template name="hotelparams"/>
etc...
</xsl:for-each>
Also note, an alternate form for the second predicate would be...
[@rating = (3 to 5)]
One could write...
[(@rating > 2) and (@rating < 6)]
or
[@rating > 2][@rating < 6]
... but I suspect that this would be less efficient, because @rating would have to be fetched twice.
The for-each below is looping through about 4mb of XML. It is testing
to ensure that each hotel node has a description and a destination. It
is also testing that each hotel has a rating greater than 2 but not 6.
The possible values for the rating in the XML are 0, 1, 2, 3, 4, 5 or
6. Ideally i would like it to only select ratings 3, 4 or 5 and ignore the others.
<xsl:for-each select="response/results/hotel[
not(@description = '') and
@rating > '2' and
not(@rating = '6') and
not(@destination = '') ]">
<xsl:call-template name="hotelparams"/>
<xsl:call-template name="upropdata"/>
<xsl:call-template name="request"/>
<xsl:call-template name="Newline"/>
</xsl:for-each>
I believe that the reason for the performance problem is in the templates that are being called (and not provided in the question) -- not in the xsl:for-each itself.
It can be re-written in different alternative ways, but the performance gains would be minimal (milliseconds), if any at all.
Do note that the provided code doesn't check for the existence of a @destination attribute at all. Any hotel element that satisfies the other conditions, but has no destination attribute, is selected.
Exactly the same is true for the description attribute.
One correct way of specifying the xsl:for-each is:
<xsl:for-each select="response/results/hotel[
string(@description)
and
@rating > 2
and
not(@rating > 5)
and
string(@destination)
]">
<xsl:call-template name="hotelparams"/>
<xsl:call-template name="upropdata"/>
<xsl:call-template name="request"/>
<xsl:call-template name="Newline"/>
</xsl:for-each>
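The effect of requiring a present, non-empty attribute can be checked with a short Python sketch of the same filter. The sample hotels, the function names, and the use of itemcode as an identifier are illustrative assumptions; the point is that a missing attribute and an empty one are both rejected, unlike with not(@description = '').

```python
import xml.etree.ElementTree as ET

XML = """<response><results>
  <hotel itemcode="A" description="Sea view" destination="Nice" rating="3"/>
  <hotel itemcode="B" description="" destination="Rome" rating="4"/>
  <hotel itemcode="C" description="Central" destination="Oslo" rating="6"/>
  <hotel itemcode="D" destination="Lima" rating="5"/>
  <hotel itemcode="E" description="Quiet" rating="4"/>
  <hotel itemcode="F" description="Old town" destination="Riga" rating="5"/>
  <hotel itemcode="G" description="Harbour" destination="Cork" rating="2"/>
</results></response>"""


def rating_in_range(value):
    """True for a numeric rating greater than 2 and not greater than 5."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return False
    return 2 < number <= 5


def select_hotels(xml_text):
    """Return the itemcode of every hotel that passes all four tests."""
    root = ET.fromstring(xml_text)
    return [
        hotel.get("itemcode")
        for hotel in root.findall("results/hotel")
        # get() returns None for a missing attribute and '' for an empty one;
        # both are falsy, like string(@attr) in a boolean context.
        if hotel.get("description")
        and hotel.get("destination")
        and rating_in_range(hotel.get("rating"))
    ]


print(select_hotels(XML))
```

Hotels D and E show the difference: each lacks one attribute entirely, so each is dropped here, whereas the original predicate would have kept them.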
Update:
The OP has now provided the code of the called templates.
I will use the following for the hotelparams template:
<xsl:sequence select=
"string-join
(
(@itemcode,
@cheapestcurrency,
@cheapestprice,
@checkin,
@checkout,
@description,
@destair,
@destination,
@destinationid,
@engine,
@hotelname,
@image,
@nights,
@rating,
@resultkey,
@resultno,
@supplierdestination,
@type),
$tab
)
"/>
I would replace the upropdata template with this code:
<xsl:sequence select="'&#9;\N&#9;\N&#9;\N&#9;\N&#9;\N&#9;2011-01-01'"/>
Or, if $tab really can be something other than a tab character, I would calculate this only once and place the result in a global variable:
<xsl:variable name="vUPropData" select=
"concat($tab,'\N',$tab,'\N',$tab,'\N',$tab,'\N',$tab,'\N',$tab,'2011-01-01')"/>
and then just have:
<xsl:sequence select="$vUPropData"/>
I would replace the request template with this code:
<xsl:sequence select=
"concat($tab,string-join(/response/request/method/@sessionkey, $tab))"/>
As this doesn't depend on any context node (is an absolute expression), I would calculate this only once and put it in a global variable (as in the previous case) and only reference this global variable.
Finally, it is not meaningful to generate the same single character in a named template. I will replace the Newline template with a global variable or with a global parameter.
I believe that after this refactoring, the code might execute significantly faster.
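The shape of the refactored output can be sketched in Python: the constant pieces are computed once, and each row is one join over the attribute values. This is only an illustration under assumptions: FIELDS is shortened to three of the eighteen attribute names, hotel_row is a hypothetical helper, and missing attributes are written as empty fields, as the original chain of xsl:value-of calls does. A string-join over a sequence of attributes would instead skip any attribute that is absent, so that form relies on every attribute being present.

```python
import xml.etree.ElementTree as ET

TAB = "\t"
# Shortened for the sketch; the real list has all eighteen attribute names.
FIELDS = ("itemcode", "hotelname", "rating")
# Computed once: five \N placeholders and a fixed date, each preceded by a tab.
UPROP_DATA = TAB + TAB.join(["\\N"] * 5) + TAB + "2011-01-01"


def hotel_row(hotel, session_keys):
    """Build one tab-delimited output line for a hotel element."""
    params = TAB.join(hotel.get(name, "") for name in FIELDS)
    request = "".join(TAB + key for key in session_keys)
    return params + UPROP_DATA + request + "\n"


hotel = ET.fromstring('<hotel itemcode="H1" hotelname="Plaza" rating="4"/>')
print(repr(hotel_row(hotel, ["abc"])))
```

Because the session keys come from an absolute path, they would be collected once before the loop and passed to every row.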

Resources