XSLT - How to speed up a complex for-each - performance

I am new to XSLT and I'm having a few speed issues with the following for-each statement. Could someone give me some pointers on how to optimise it, please?
The for-each below is looping through about 4 MB of XML. It is testing to ensure that each hotel node has a description and a destination. It is also testing that each hotel has a rating greater than 2 but not 6. The possible values for the rating in the XML are 0, 1, 2, 3, 4, 5 or 6. Ideally I would like it to select only ratings 3, 4 or 5 and ignore the others.
<xsl:for-each select="response/results/hotel[
not(@description = '') and
@rating > '2' and
not(@rating = '6') and
not(@destination = '') ]">
<xsl:call-template name="hotelparams"/>
<xsl:call-template name="upropdata"/>
<xsl:call-template name="request"/>
<xsl:call-template name="Newline"/>
</xsl:for-each>
As requested, I have added the templates that are being called below. The output is tab-delimited text files, which are then imported into MySQL. By the way, please ignore the upropdata template; it will be removed shortly...
<xsl:template name="hotelparams">
<xsl:value-of select="@itemcode"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@cheapestcurrency"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@cheapestprice"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@checkin"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@checkout"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@description"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@destair"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@destination"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@destinationid"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@engine"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@hotelname"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@image"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@nights"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@rating"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@resultkey"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@resultno"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@supplierdestination"/><xsl:value-of select="$tab"/>
<xsl:value-of select="@type"/></xsl:template>
<xsl:template name="upropdata">
<xsl:value-of select="$tab"/>\N<xsl:value-of select="$tab"/>\N<xsl:value-of select="$tab"/>\N<xsl:value-of select="$tab"/>\N<xsl:value-of select="$tab"/>\N<xsl:value-of select="$tab"/>2011-01-01</xsl:template>
<xsl:template name="request">
<xsl:for-each select="/response/request/method"><xsl:value-of select="$tab"/><xsl:value-of select="./@sessionkey"/></xsl:for-each></xsl:template>
<xsl:template name="Newline">
<xsl:text>
</xsl:text></xsl:template>

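How about the following? Note that comparing against a sequence such as (3,4,5) requires XPath 2.0, so this needs an XSLT 2.0 processor.
<xsl:for-each select="response/results/hotel
[not(@description = '')]
[@rating = (3,4,5)]">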
<xsl:call-template name="hotelparams"/>
<xsl:call-template name="upropdata"/>
<xsl:call-template name="request"/>
<xsl:call-template name="Newline"/>
</xsl:for-each>
Note: I have not included a check for destination, because you did not specify its node name.
Also, if you can eliminate the possibility of empty description attributes (that is to say, hotels will have either a non-empty description or no description attribute at all), then you can use this slightly abbreviated form, which simply tests that the attribute exists...
<xsl:for-each select="response/results/hotel
[@description]
[@rating = (3,4,5)]">
<xsl:call-template name="hotelparams"/>
etc...
</xsl:for-each>
Also note, an alternate form for the second predicate would be...
[@rating = (3 to 5)]
One could write...
[(@rating > 2) and (@rating < 6)]
or
[@rating > 2][@rating < 6]
... but I suspect that this would be less efficient, because @rating would have to be fetched twice.

I believe that the reason for the performance problem is in the templates that are being called (and not provided in the question) -- not in the xsl:for-each itself.
It can be re-written in different alternative ways, but the performance gains would be minimal (milliseconds), if any at all.
Do note that the provided code doesn't check for the existence of a @destination attribute at all. Any hotel element that satisfies the other conditions but has no destination attribute is selected.
Exactly the same is true for the description attribute.
One correct way of specifying the xsl:for-each is:
<xsl:for-each select="response/results/hotel[
string(@description)
and
@rating > 2
and
not(@rating > 5)
and
string(@destination)
]">
<xsl:call-template name="hotelparams"/>
<xsl:call-template name="upropdata"/>
<xsl:call-template name="request"/>
<xsl:call-template name="Newline"/>
</xsl:for-each>
Update:
The OP has now provided the code of the called templates.
I will use the following for the hotelparams template:
<xsl:sequence select=
"string-join
(
(@itemcode,
@cheapestcurrency,
@cheapestprice,
@checkin,
@checkout,
@description,
@destair,
@destination,
@destinationid,
@engine,
@hotelname,
@image,
@nights,
@rating,
@resultkey,
@resultno,
@supplierdestination,
@type),
$tab
)
"/>
I would replace the upropdata template with this code:
<xsl:sequence select="'&#9;\N&#9;\N&#9;\N&#9;\N&#9;\N&#9;2011-01-01'"/>
Or, if $tab really can be something other than a tab character, I would calculate this only once and place the result in a global variable:
<xsl:variable name="vUPropData" select=
"concat($tab,'\N',$tab,'\N',$tab,'\N',$tab,'\N',$tab,'\N',$tab,'2011-01-01')"/>
and then just have:
<xsl:sequence select="$vUPropData"/>
I would replace the request template with this code:
<xsl:sequence select=
"concat($tab,string-join(/response/request/method/@sessionkey, $tab))"/>
As this doesn't depend on any context node (is an absolute expression), I would calculate this only once and put it in a global variable (as in the previous case) and only reference this global variable.
Finally, it is not meaningful to generate the same single character in a named template. I will replace the Newline template with a global variable or with a global parameter.
I believe that after this refactoring, the code might execute significantly faster.
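To make the refactoring above concrete, here is a minimal sketch of how the pieces might fit together in one XSLT 2.0 stylesheet. The names vNL, vRequest and vColumns are hypothetical (they are not from the original post), the column list is shortened for brevity, and $tab is assumed to be a single tab character. One thing worth knowing: a plain string-join over (@itemcode, @cheapestcurrency, ...) silently skips attributes that are absent, which would shift the columns for that row. The sketch therefore joins by attribute name so that a missing attribute still yields an empty field; whether that matters depends on the data.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumption: $tab is a single tab character -->
  <xsl:variable name="tab" select="'&#9;'"/>

  <!-- Hypothetical global: the newline, defined once instead of a named template -->
  <xsl:variable name="vNL" select="'&#10;'"/>

  <!-- Hypothetical global: absolute expression, so it is evaluated once -->
  <xsl:variable name="vRequest"
    select="concat($tab, string-join(/response/request/method/@sessionkey, $tab))"/>

  <!-- Hypothetical global: attribute names in output order (shortened list) -->
  <xsl:variable name="vColumns"
    select="('itemcode', 'cheapestcurrency', 'cheapestprice', 'rating', 'type')"/>

  <xsl:template match="/">
    <xsl:for-each select="response/results/hotel[
        string(@description) and
        @rating &gt; 2 and
        not(@rating &gt; 5) and
        string(@destination)]">
      <!-- string() turns a missing attribute into '', keeping the column count fixed -->
      <xsl:value-of select="string-join(
          for $n in $vColumns return string(@*[name() = $n]), $tab)"/>
      <xsl:value-of select="$vRequest"/>
      <xsl:value-of select="$vNL"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Whether this is measurably faster than the named templates will depend on the processor, so it is worth timing both versions against the real 4 MB input.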

Related

Storing reference (hash?) to the node for conditional manipulation

Given this pseudo-code:
<xsl:variable name="check0">
<xsl:value-of select="($externalFile//i[@attibute = $variable]/../@start &lt; $genDate) and
($externalFile//i[@attibute = $variable]/../@stop &gt; $genDate)"/>
</xsl:variable>
<xsl:variable name="check1">
<xsl:value-of select="($externalFile//i[@attibute = $variable]/../@start &lt; $genDate) and
($externalFile//i[@attibute = $variable]/../@stop &gt; $genDate)"/>
</xsl:variable>
The code above checks whether some variable falls within the date range given by attributes taken from an external .xml file.
Is there a way to store a reference to the node so that this
$externalFile//i[@attibute = $variable] look-up doesn't happen 4 times?
Something like this:
<xsl:variable name="check3">
<xsl:value-of select="$externalFile//i[@attibute = $variable]"/>
</xsl:variable>
<xsl:if test="$check3/../@start &gt; someValue"/>
<xsl:if test="$check3/../@stop &lt; someValue"/>
<xsl:variable name="outcome">
<xsl:value-of select="$check3/../@price"/> <!-- retrieve some data -->
</xsl:variable>
The answer is "Yes". You simply need to change your check3 variable to this:
<xsl:variable name="check3" select="$externalFile//i[@attibute = $variable]" />
This way you are referencing the i node in the external file directly, rather than getting its text value (which is what xsl:value-of would give you).
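As an illustration, here is a rough sketch of how the stored node reference could then be reused. The file name external.xml, the parameter defaults and the helper variable period are all made up for the example. It also assumes the start/stop values and $genDate compare numerically (for instance yyyymmdd numbers), since XSLT 1.0 converts both sides of &lt; and &gt; to numbers.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical inputs; the real values come from your own stylesheet -->
  <xsl:param name="variable" select="'A1'"/>
  <xsl:param name="genDate" select="20130615"/>
  <xsl:variable name="externalFile" select="document('external.xml')"/>

  <xsl:template match="/">
    <!-- The look-up happens once; check3 holds the matching i node(s) -->
    <xsl:variable name="check3" select="$externalFile//i[@attibute = $variable]"/>
    <!-- Hypothetical helper: the parent element carrying start, stop and price -->
    <xsl:variable name="period" select="$check3/.."/>
    <xsl:if test="$period/@start &lt; $genDate and $period/@stop &gt; $genDate">
      <xsl:value-of select="$period/@price"/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

If more than one i element can match, the comparisons become "true if any node satisfies it", so you may want to narrow the selection first.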

How to get sum of an attribute value which is referenced by id multiple times with xpath in xslt 1.0?

I really do hope that my title is at least a bit clear.
Important: I can only use XSLT 1.0 because the project needs to work with the MSXML XSLT processor.
What I try to do:
I generate documents containing information about rooms. Rooms have walls, I need the sum of wall area of these per room.
The input xml file I get is dynamically created by another program.
Changing the structure of the input xml file is not the solution, trust me, it's needed like that and is much more complex than I show you here.
My XML (the innerArea attribute in the wall element has to get summed up):
<root>
<floor id="30" name="EG">
<flat name="Wohnung" nr="1">
<Room id="49" area="93.08565">
<WallSegments>
<WallSegment id="45"/>
<WallSegment id="42"/>
<WallSegment id="39"/>
</WallSegments>
</Room>
</flat>
</floor>
<components>
<Wall id="20" innerArea="20.7654"/>
<wallSegment id="45" wall="20">[...]</wallSegment>
<Wall id="21" innerArea="12.45678"/>
<wallSegment id="42" wall="21">[...]</wallSegment>
<Wall id="22" innerArea="17.8643"/>
<wallSegment id="39" wall="22">[...]</wallSegment>
</components>
</root>
With my XSLT I was able to reach the values of the walls which belong to a room.
But I have really no idea how I could get the sum of the value out of that.
My XSLT:
<xsl:for-each select="flat/Room">
<xsl:for-each select="WallSegments/WallSegment">
<xsl:variable name="curWallSegId" select="@id"/>
<xsl:for-each select="/root/components/wallSegment[@id = $curWallSegId]">
<xsl:variable name="curWallId" select="@wall"/>
<xsl:for-each select="/root/components/Wall[@id = $curWallId]">
<!--I didn't expect that this was working, but at least I tried :D-->
<xsl:value-of select="sum(@AreaInner)"/>
</xsl:for-each>
</xsl:for-each>
</xsl:for-each>
</xsl:for-each>
Desired Output should be something like...
[...]
<paragraph>
Room 1:
Wall area: 51.09 m²
[...]
</paragraph>
[...]
So I hope I described my problem properly. If not: I am sorry, you may beat me right into the face x)
It's best to use keys to get "related" data. Place this at the top of your stylesheet, outside of any template:
<xsl:key name="wall" match="components/Wall" use="@id" />
<xsl:key name="wallSegment" match="components/wallSegment" use="@id" />
Then:
<xsl:for-each select="flat/Room">
<paragraph>
<xsl:text>Room </xsl:text>
<xsl:value-of select="position()"/>
<xsl:text>:
Wall area: </xsl:text>
<xsl:value-of select="format-number(sum(key('wall', key('wallSegment', WallSegments/WallSegment/@id)/@wall)/@innerArea), '0.00m²')"/>
<xsl:text>
</xsl:text>
</paragraph>
</xsl:for-each>
will return:
<paragraph>Room 1:
Wall area: 51.09m²</paragraph>
If what you need is the total wall area of every room, this is one way of getting it:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:template match="/root/floor">
<xsl:for-each select="flat/Room">
<xsl:variable name="currentRoomSegmentsIds" select="WallSegments/WallSegment/@id"/>
<xsl:variable name="currentRoomWallsIds" select="/root/components/wallSegment[@id = $currentRoomSegmentsIds]/@wall"/>
<xsl:variable name="currentRoomWallsInnerAreas" select="/root/components/Wall[@id = $currentRoomWallsIds]/@innerArea"/>
Id of the room = <xsl:value-of select="@id"/>.
Area of the room = <xsl:value-of select="sum($currentRoomWallsInnerAreas)"/>
</xsl:for-each> <!-- End of for each room -->
</xsl:template>
</xsl:stylesheet>
This produces the following result:
Id of the room = 49.
Area of the room = 51.08648

XSLT concatenate input from several nodes in a single output

I'm trying to work out a transformation that will process an input with several Flights with Departure and Arrival into a single output with the complete route for the flights.
Input is as follows:
<FlightTrip>
<flights>
<departureAirport>
<airportCode>LocB</airportCode>
</departureAirport>
<departureTime>2013-03-28T10:00:00.000</departureTime>
<arrivalAirport>
<airportCode>LocC</airportCode>
</arrivalAirport>
</flights>
<flights>
<departureAirport>
<airportCode>LocA</airportCode>
</departureAirport>
<departureTime>2013-03-27T15:00:00.000</departureTime>
<arrivalAirport>
<airportCode>LocB</airportCode>
</arrivalAirport>
</flights>
<flights>
<departureAirport>
<airportCode>LocC</airportCode>
</departureAirport>
<departureTime>2013-03-30T14:00:00.000</departureTime>
<arrivalAirport>
<airportCode>LocD</airportCode>
</arrivalAirport>
</flights>
</FlightTrip>
The desired output would be this:
<FullTrip>LocA LocB LocC LocD</FullTrip>
I've tried to use foreach inside the output variable but I can't get it right. I also need to sort the input based on the departure date as the Flights can be in a different order (as per the sample input).
Any ideas of how to achieve this?
Thanks a lot!
Bruno
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output indent="yes"/>
<xsl:template match="FlightTrip">
<FullTrip>
<xsl:apply-templates select="flights">
<xsl:sort select="departureTime"/>
</xsl:apply-templates>
</FullTrip>
</xsl:template>
<xsl:template match="flights">
<xsl:value-of select="departureAirport/airportCode"/><xsl:text> </xsl:text>
<xsl:if test="position()=last()">
<xsl:value-of select="arrivalAirport/airportCode"/>
</xsl:if>
</xsl:template>
</xsl:transform>
Will produce:
<FullTrip>LocA LocB LocC LocD</FullTrip>
Thanks to Joepie for the enlightenment. I had to modify it a bit to get it to work in my environment, ended up using foreach as below:
<xsl:template match="/">
<xsl:variable name="locations">
<xsl:for-each select="/FlightTrip/flights">
<xsl:sort select="departureTime" order="ascending" data-type="text"/>
<xsl:value-of select="concat(departureAirport/airportCode,' - ')"/>
<xsl:if test="position() = last()">
<xsl:value-of select="arrivalAirport/airportCode"/>
</xsl:if>
</xsl:for-each>
</xsl:variable>
<FullTrip>
<xsl:value-of select="$locations"/>
</FullTrip>
</xsl:template>
When applied to the example produces the output below:
<FullTrip>LocA - LocB - LocC - LocD</FullTrip>
Thanks again!

Set Union Operator in xpath

<xsl:variable name="targetReceiverService">
<EMP_EMPLOC_MAL curr="4.0">MAL</EMP_EMPLOC_MAL>
<EMP_EMPLOC_SIN curr="1.6">SIN</EMP_EMPLOC_SIN>
<EMP_EMPLOC_CHN curr="7.8">CHN</EMP_EMPLOC_CHN>
<DEFAULT curr="1.0">NONE</DEFAULT>
</xsl:variable>
<xsl:variable name="targetCountryCode" select="$targetReceiverService/*[name() = $ReceiverService] | $targetReceiverService/DEFAULT"/>
<xsl:value-of select="$targetCountryCode "/>
Why is only MAL displayed for $targetCountryCode, and not NONE as well, given that "|" means union?
In XSLT 1.0, xsl:value-of outputs the string value of only the first node (in document order) of the node-set.
You can probably display all of them with
<xsl:for-each select="$targetCountryCode">
<xsl:value-of select="."/>
</xsl:for-each>
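If the values need a separator between them, a small variation on that loop works in XSLT 1.0. The sketch below is only an illustration: it assumes the lookup elements from the question are children of the input document's root element (rather than the content of a variable), and the default value given to the ReceiverService parameter is made up.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical default; normally supplied by the caller -->
  <xsl:param name="ReceiverService" select="'EMP_EMPLOC_MAL'"/>

  <!-- Assumption: EMP_EMPLOC_* and DEFAULT are children of the root element -->
  <xsl:template match="/*">
    <xsl:variable name="targetCountryCode"
      select="*[name() = $ReceiverService] | DEFAULT"/>
    <xsl:for-each select="$targetCountryCode">
      <xsl:value-of select="."/>
      <!-- Emit a separator after every node except the last -->
      <xsl:if test="position() != last()">
        <xsl:text>, </xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the four elements from the question as input, this should give MAL, NONE, because the union is processed in document order.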

is there a way to emulate correctly replace function on XPATH 1?

I have this template, which tries to replace dots and/or dashes with underscores.
I'm limited to XPath 1.0, so the replace function is NOT an option. The template doesn't quite work, because if I use something like this:
FOO-BAR.THING-MADRID.html
it gives me out on screen this thing:
FOO-BAR.THING-MADRID.html
the middle dot is not replaced.
Someone could help me?
<xsl:template name="replaceDots">
<xsl:param name="outputString"/>
<xsl:variable name="target">.</xsl:variable>
<xsl:variable name="source">-</xsl:variable>
<xsl:variable name="replacement">_</xsl:variable>
<xsl:choose>
<xsl:when test="contains($outputString,$source)">
<xsl:value-of select="concat(substring-before($outputString,$source),$replacement)" disable-output-escaping="yes"/>
<xsl:call-template name="replaceDots">
<xsl:with-param name="outputString" select="substring-after($outputString,$source)"/>
</xsl:call-template>
</xsl:when>
<xsl:when test="contains($outputString,$target)">
<xsl:value-of select="concat(substring-before($outputString,$target),$replacement)" disable-output-escaping="yes"/>
<xsl:call-template name="replaceDots">
<xsl:with-param name="outputString" select="substring-after($outputString,$target)"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$outputString" disable-output-escaping="yes"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
To replace all dots or dashes with underscores, you do not need an <xsl:template>. You can use:
<xsl:value-of select="translate(., '-.', '__')" />
If you want to keep the ".html", you can extend this like so:
<xsl:value-of select="
concat(
translate(substring-before(., '.html'), '-.', '__'),
'.html'
)
" />
For a generic "string replace" template in XSLT, look at this question, for example.
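For reference, such a generic template usually looks something like the sketch below. This is one common recursive pattern for XSLT 1.0, not any particular library's implementation; the template name string-replace-all and its parameter names text, replace and by are just illustrative.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: replaces every occurrence of $replace in $text with $by -->
  <xsl:template name="string-replace-all">
    <xsl:param name="text"/>
    <xsl:param name="replace"/>
    <xsl:param name="by"/>
    <xsl:choose>
      <!-- Guard against an empty search string, which would recurse forever -->
      <xsl:when test="$replace != '' and contains($text, $replace)">
        <xsl:value-of select="substring-before($text, $replace)"/>
        <xsl:value-of select="$by"/>
        <xsl:call-template name="string-replace-all">
          <xsl:with-param name="text" select="substring-after($text, $replace)"/>
          <xsl:with-param name="replace" select="$replace"/>
          <xsl:with-param name="by" select="$by"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Usage example with a literal string: replaces each dash with an underscore -->
  <xsl:template match="/">
    <xsl:call-template name="string-replace-all">
      <xsl:with-param name="text" select="'FOO-BAR.THING-MADRID.html'"/>
      <xsl:with-param name="replace" select="'-'"/>
      <xsl:with-param name="by" select="'_'"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

Unlike translate(), this handles search strings longer than one character, but it replaces only one search string per call, so for single-character swaps like dots and dashes translate() remains the simpler choice.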

Resources