How to eliminate all <TAG/> and all attribute="" by XSLT? - xpath

In an xsl:stylesheet I have this identity-like transform to eliminate comments, empty (terminal) tags, and empty attributes, but the second xsl:when does not work:
<xsl:template match="node()">
<xsl:choose>
<xsl:when test="name()='p' and not(./*) and not(normalize-space(.))"></xsl:when>
<xsl:when test="not(name()='img') and not(name()='br') and not(./*) and not(text())"
></xsl:when> <!-- this line NOT WORKS -->
<xsl:otherwise><xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy></xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="@*">
<xsl:choose>
<xsl:when test="not(normalize-space(.))"></xsl:when>
<xsl:otherwise><xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy></xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="comment()"></xsl:template>
How do I express the condition for empty tags in this context?
PS: the "empty rules" are explained here; I tried to use them, but I don't see why they are not working.

An empty element is an element with no child nodes.
Template match priority is your friend ... the following should be the kind of identity stylesheet that meets your description plus what I think you are doing with image and break elements.
<?xml version="1.0" encoding="US-ASCII"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<!--toss these-->
<xsl:template match="comment() |
*[not(node())] |
@*[not(normalize-space())]"/>
<!--preserve these-->
<xsl:template match="img|br" priority="1">
<xsl:call-template name="identity"/>
</xsl:template>
<!--preserve everything else-->
<xsl:template match="@*|node()" name="identity">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

Related

Constructing variable with template in XSLT and then applying xpath

I am using following xslt
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.test.com/services/test/test/v1">
<xsl:output method="xml" encoding="UTF-8"
omit-xml-declaration="yes" indent="yes" />
<xsl:strip-space elements="*" />
<xsl:template match="/">
<xsl:variable name="mytree">
<xsl:call-template name="myvariable">
</xsl:call-template>
</xsl:variable>
<xsl:choose>
<xsl:when test="count($mytree/foos/foo) > 1">
<xsl:copy-of select="$mytree"/>
</xsl:when>
<xsl:otherwise>
<error>test</error>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template name="myvariable">
<foos>
<foo>bar1</foo>
<foo>bar2</foo>
<foo>bar3</foo>
<foo>bar4</foo>
</foos>
</xsl:template>
</xsl:stylesheet>
When I use the above XSLT, the output should be the following:
<foos xmlns="http://www.test.com/services/test/test/v1">
<foo>bar1</foo>
<foo>bar2</foo>
<foo>bar3</foo>
<foo>bar4</foo>
</foos>
but it is
<error xmlns="http://www.test.com/services/test/test/v1">test</error>
When I remove xmlns="http://www.test.com/services/test/test/v1", the output is correct. What is happening?
With any XML, whether constructed inside your XSLT or read from a source, elements in a namespace must be selected by their namespace-qualified names; an unprefixed name in an XPath expression selects elements in no namespace by default. In XSLT 2.0 you have two options: either set xpath-default-namespace="http://www.test.com/services/test/test/v1" (e.g. <xsl:when test="count($mytree/foos/foo) > 1" xpath-default-namespace="http://www.test.com/services/test/test/v1">), or bind the namespace to a prefix (e.g. <xsl:when xmlns:v1="http://www.test.com/services/test/test/v1" test="count($mytree/v1:foos/v1:foo) > 1">).
You can use these approaches on an ancestor element, for instance the root element of the stylesheet, if it does not interfere with other selections you want to make.
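As a sketch of the first option applied at the stylesheet root, the question's stylesheet could look like the following. It only adds the xpath-default-namespace attribute to xsl:stylesheet; everything else is taken from the question. It assumes an XSLT 2.0 processor, and it assumes that no other XPath expression or match pattern in the stylesheet needs to select elements in no namespace, because the attribute affects every unprefixed element name in scope.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.test.com/services/test/test/v1"
    xpath-default-namespace="http://www.test.com/services/test/test/v1">

  <xsl:output method="xml" encoding="UTF-8"
      omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="/">
    <!-- Temporary tree built by the named template below -->
    <xsl:variable name="mytree">
      <xsl:call-template name="myvariable"/>
    </xsl:variable>
    <xsl:choose>
      <!-- foos and foo now resolve to the v1 namespace
           because of xpath-default-namespace on the root element -->
      <xsl:when test="count($mytree/foos/foo) > 1">
        <xsl:copy-of select="$mytree"/>
      </xsl:when>
      <xsl:otherwise>
        <error>test</error>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template name="myvariable">
    <!-- Literal result elements pick up the default namespace declared above -->
    <foos>
      <foo>bar1</foo>
      <foo>bar2</foo>
      <foo>bar3</foo>
      <foo>bar4</foo>
    </foos>
  </xsl:template>
</xsl:stylesheet>
```

Note that xmlns="..." and xpath-default-namespace="..." do different jobs: the former puts the literal foos and foo elements into the namespace, the latter tells XPath which namespace unprefixed element names refer to.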
You have to specify qualified element names in your XPath expression to address the foos and foo elements in your default namespace http://www.test.com/services/test/test/v1:
Register the default namespace once more with a namespace prefix (e.g. myns): xmlns:myns="http://www.test.com/services/test/test/v1"
Use that namespace prefix in your XPath expressions to address nodes in that namespace (e.g. myns:foos/myns:foo).
Add exclude-result-prefixes="myns" to suppress the myns prefix in your result document.
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.test.com/services/test/test/v1"
xmlns:myns="http://www.test.com/services/test/test/v1"
exclude-result-prefixes="myns">
…
<xsl:template match="/">
…
<xsl:choose>
<xsl:when test="count($mytree/myns:foos/myns:foo) > 1">
<xsl:copy-of select="$mytree"/>
</xsl:when>
<xsl:otherwise>
<error>test</error>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
…
</xsl:stylesheet>
see XSLT Fiddle
If you only had an XSLT 1.0 processor at hand, you would need the EXSLT node-set function to convert the $mytree result tree fragment into a node-set before applying a path expression to it:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.test.com/services/test/test/v1"
xmlns:exsl="http://exslt.org/common"
extension-element-prefixes="exsl"
xmlns:myns="http://www.test.com/services/test/test/v1"
exclude-result-prefixes="myns">
…
<xsl:template match="/">
…
<xsl:choose>
<xsl:when test="count(exsl:node-set($mytree)/myns:foos/myns:foo) > 1">
<xsl:copy-of select="$mytree"/>
</xsl:when>
<xsl:otherwise>
<error>test</error>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
…
</xsl:stylesheet>
Use this code to remove the namespace:
<xsl:template match="@*[namespace-uri() = 'http://www.test.com/services/test/test/v1']"/>

Abbreviating text with whitespace with XSLT

I want to extract short lemmas out of text for some explanatory notes. That is, if the text is too long it should output only the first and the last word. This works:
<?xml version="1.0" encoding="UTF-8"?>
<lemma>
<a><b>I</b> can what I can and <b><c>what</c></b> I can't I can</a>
</lemma>
when this xslt is applied
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
version="2.0">
<xsl:output method="xml" encoding="utf-8" indent="yes"/>
<!-- Identity template : copy all text nodes, elements and attributes -->
<xsl:template match="@*|node()">
<xsl:copy copy-namespaces="no">
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="lemma">
<xsl:value-of select="."/>
<xsl:choose>
<xsl:when test="string-length(normalize-space(a)) > 20">
<xsl:value-of select="tokenize(a,' ')[1]"/>
<xsl:text> […] </xsl:text>
<xsl:value-of select="tokenize(a,' ')[last()]"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="a"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
produces the desired output:
I can what I can and what I can't I can
I […] can
Unfortunately, whenever two child elements are immediately adjacent, the space between them is encoded as a child element named "space". The above solution doesn't work with:
<lemma>
<a><b>I</b><space/><b>can</b> what I can and what I can't I can</a>
</lemma>
I tried to process the space element first, but that doesn't work (and I know why); I just don't know how to do it better. It would work with two XSLT runs, I suppose.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
version="2.0">
<xsl:output method="xml" encoding="utf-8" indent="yes"/>
<!-- Identity template : copy all text nodes, elements and attributes -->
<xsl:template match="@*|node()">
<xsl:copy copy-namespaces="no">
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="space">
 
</xsl:template>
<xsl:template match="lemma">
<xsl:apply-templates select="space"/>
<xsl:value-of select="."/>
<xsl:choose>
<xsl:when test="string-length(normalize-space(a)) > 20">
<xsl:value-of select="tokenize(a,' ')[1]"/>
<xsl:text> […] </xsl:text>
<xsl:value-of select="tokenize(a,' ')[last()]"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="a"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
Output:
Ican what I can and what I can't I can
Ican […] can
You could do an xsl:apply-templates to process a and save it in a variable...
XML Input
<doc>
<lemma>
<a><b>I</b> can what I can and <b><c>what</c></b> I can't I can</a>
</lemma>
<lemma>
<a><b>I</b><space/><b>can</b> what I can and what I can't I can</a>
</lemma>
</doc>
XSLT 2.0
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="space">
<xsl:text> </xsl:text>
</xsl:template>
<xsl:template match="lemma">
<xsl:variable name="a">
<xsl:apply-templates select="a"/>
</xsl:variable>
<xsl:variable name="norm" select="normalize-space($a)"/>
<xsl:variable name="tokens" select="tokenize($norm,'\s')"/>
<xsl:copy>
<result>
<xsl:value-of select="$norm"/>
</result>
<result>
<xsl:value-of select="
if (string-length($norm) > 20) then
concat($tokens[1],' […] ', $tokens[last()])
else $norm"/>
</result>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
XML Output
<doc>
<lemma>
<result>I can what I can and what I can't I can</result>
<result>I […] can</result>
</lemma>
<lemma>
<result>I can what I can and what I can't I can</result>
<result>I […] can</result>
</lemma>
</doc>

XSLT 2.0 dynamic XPATH expression

I have one XML file that I need to transform based on a mapping file with XSLT 2.0. I'm using the Saxon HE processor.
My mapping file:
<element root="TEST">
<childName condition="/TEST/MyElement/CHILD[text()='B']">/TEST/MyElement/CHILD</childName>
<childBez condition="/TEST/MyElement/CHILD[text()='B']">/TEST/MyElement/CHILDBEZ</childBez>
</element>
I have to copy the elements CHILD and CHILDBEZ plus the parent and the root elements when the text of CHILD equals B.
So with this Input:
<?xml version="1.0" encoding="UTF-8"?>
<TEST>
<MyElement>
<CHILD>A</CHILD>
<CHILDBEZ>ABEZ</CHILDBEZ>
<NotInteresting></NotInteresting>
</MyElement>
<MyElement>
<CHILD>B</CHILD>
<CHILDBEZ>BBEZ</CHILDBEZ>
<NotInteresting2></NotInteresting2>
</MyElement>
</TEST>
the desired output:
<TEST>
<MyElement>
<childName>B</childName>
<childBez>BBEZ</childBez>
</MyElement>
</TEST>
what I have so far (based on this solution XSLT 2.0 XPATH expression with variable):
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:strip-space elements="*"/>
<xsl:param name="mapping" select="document('mapping.xml')"/>
<xsl:key name="map" match="*" use="."/>
<xsl:template match="/">
<xsl:variable name="first-pass">
<xsl:apply-templates mode="first-pass"/>
</xsl:variable>
<xsl:apply-templates select="$first-pass/*"/>
</xsl:template>
<xsl:template match="*" mode="first-pass">
<xsl:param name="parent-path" tunnel="yes"/>
<xsl:variable name="path" select="concat($parent-path, '/', name())"/>
<xsl:variable name="replacement" select="key('map', $path, $mapping)"/>
<xsl:variable name="condition" select="key('map', $path, $mapping)/@condition"/>
<xsl:choose>
<xsl:when test="$condition!= ''">
<!-- if there is a condition defined in the mapping file, check for it -->
</xsl:when>
<xsl:otherwise>
<xsl:element name="{if ($replacement) then name($replacement) else name()}">
<xsl:attribute name="original" select="not($replacement)"/>
<xsl:apply-templates mode="first-pass">
<xsl:with-param name="parent-path" select="$path" tunnel="yes"/>
</xsl:apply-templates>
</xsl:element>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<xsl:template match="*">
<xsl:copy>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
<xsl:template match="*[@original='true' and not(descendant::*/@original='false')]"/>
</xsl:stylesheet>
The problem is that it's impossible to evaluate dynamic XPath expressions with XSLT 2.0. Does anyone know a workaround for that? I also have a problem with the mapping file: when there is only one element in it, it doesn't work at all.
If dynamic XPath evaluation isn't an option in your chosen processor, then generating an XSLT stylesheet is often a good alternative. In fact, it's often a good alternative anyway.
One way of thinking about this is that your mapping file is actually a program written in a very simple transformation language. There are two ways of executing this program: you can write an interpreter (dynamic XPath evaluation), or you can write a compiler (XSLT stylesheet generation). Both work well.
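To illustrate the "compiler" route, here is a minimal sketch of a stylesheet that reads the mapping file and writes another stylesheet. It only shows the mechanism: the condition attribute and the text of each mapping entry are pasted into the generated code as XPath. The gen prefix and its URI are placeholders invented for this example, and the generated logic deliberately ignores how a condition should be tied to a particular MyElement, which the mapping format in the question leaves undefined.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:gen="urn:example:generated-xslt">

  <!-- Elements written with the (hypothetical) gen: prefix are emitted
       in the XSLT namespace, so the output is itself a stylesheet. -->
  <xsl:namespace-alias stylesheet-prefix="gen" result-prefix="xsl"/>
  <xsl:output method="xml" indent="yes"/>

  <!-- The mapping root becomes the skeleton of the generated stylesheet. -->
  <xsl:template match="/element">
    <gen:stylesheet version="2.0">
      <gen:template match="/">
        <!-- @root (TEST in the question) becomes a literal result element -->
        <xsl:element name="{@root}">
          <xsl:apply-templates select="*"/>
        </xsl:element>
      </gen:template>
    </gen:stylesheet>
  </xsl:template>

  <!-- Each mapping entry becomes a conditional in the generated code.
       The attribute value templates copy the XPath strings from the mapping. -->
  <xsl:template match="/element/*">
    <gen:if test="{@condition}">
      <xsl:element name="{name()}">
        <gen:value-of select="{.}"/>
      </xsl:element>
    </gen:if>
  </xsl:template>
</xsl:stylesheet>
```

Running this against the mapping file yields a second stylesheet containing ordinary xsl:if and xsl:value-of instructions with static XPath expressions, which is then applied to the data file in a separate transformation. A real generator would need a defined rule for scoping each condition to its MyElement before it could produce the desired output shown in the question.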

adding attribute to an existing node

I need to add an attribute initial-page-number to an fo:page-sequence element.
The tag is
<fo:page-sequence master-reference="alternating" initial-page-number="1"><fo:page-sequence>
..
...
</fo:page-sequence>
and should become
<fo:page-sequence master-reference="alternating" initial-page-number="1">
..
</fo:page-sequence>
but with my XSLT I get two nested fo:page-sequence elements:
<fo:page-sequence master-reference="alternating" initial-page-number="1"><fo:page-sequence>
</fo:page-sequence></fo:page-sequence>
How can I replace the old fo:page-sequence with the new one?
This is my xsl stylesheet:
<xsl:stylesheet>
<xsl:template match="ss:split/fo:page-sequence">
<xsl:choose>
<xsl:when test="@master-reference['alternating']">
<xsl:element name="fo:page-sequence">
<xsl:for-each select="@*">
<xsl:attribute name="{name()}"><xsl:value-of select="."/></xsl:attribute>
</xsl:for-each>
<xsl:attribute name="initial-page-number">
<xsl:value-of select="1"/>
</xsl:attribute>
<xsl:copy>
<xsl:apply-templates select="child::*"/>
</xsl:copy>
</xsl:element>
</xsl:when>
</xsl:choose>
</xsl:template>
<xsl:template match='comment()'>
<xsl:comment><xsl:value-of select="."/></xsl:comment>
</xsl:template>
<xsl:template match="@*|*">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
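Your stylesheet changes every fo:page-sequence that has a master-reference attribute, because the predicate ['alternating'] is a non-empty string and is therefore always true; it does not compare the attribute's value.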
You can check for the master-reference value in the match pattern, plus you can just copy the existing attributes, and you can copy the contents of the fo:page-sequence since it won't contain another fo:page-sequence:
<xsl:template
match="ss:split/fo:page-sequence[@master-reference = 'alternating']">
<xsl:copy>
<xsl:copy-of select="@*" />
<xsl:attribute name="initial-page-number">1</xsl:attribute>
<xsl:copy-of select="node()" />
</xsl:copy>
</xsl:template>
Your stylesheet creates an fo:page-sequence using <xsl:element name="fo:page-sequence">, and another one with <xsl:copy> (as the matching element is an fo:page-sequence).
Just remove the xsl:copy (but leave <xsl:apply-templates select="child::*"/>, as you want to process the children of the current node!) and you should get what you need.
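A sketch of the question's stylesheet with that change applied might look like this. The ss namespace URI is a placeholder, since the question does not show it, and the test is written as an equality comparison on the assumption that only the "alternating" sequences are meant to be changed.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns:ss="urn:example:split">
  <!-- ASSUMPTION: urn:example:split stands in for the real ss namespace URI -->

  <xsl:template match="ss:split/fo:page-sequence">
    <xsl:choose>
      <xsl:when test="@master-reference = 'alternating'">
        <!-- Only one fo:page-sequence is created now -->
        <xsl:element name="fo:page-sequence">
          <!-- Carry over the existing attributes -->
          <xsl:for-each select="@*">
            <xsl:attribute name="{name()}"><xsl:value-of select="."/></xsl:attribute>
          </xsl:for-each>
          <!-- Add (or overwrite) the new attribute -->
          <xsl:attribute name="initial-page-number">1</xsl:attribute>
          <!-- xsl:copy removed; the children are still processed -->
          <xsl:apply-templates select="child::*"/>
        </xsl:element>
      </xsl:when>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="comment()">
    <xsl:comment><xsl:value-of select="."/></xsl:comment>
  </xsl:template>

  <xsl:template match="@*|*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

As in the original, there is no xsl:otherwise branch, so an fo:page-sequence under ss:split with a different master-reference produces no output; add an xsl:otherwise that copies the element if those sequences should be kept.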

How to order self-referencing xml

I have a list of order lines, each with one product. The products may form a self-referencing hierarchy. I need to order the lines in such a way that all products that have no parent, or whose parent is missing from the order, are at the top, followed by their children. No child may be above its parent in the end result.
How can I order the following XML:
<order>
<line><product code="3" parent="1"/></line>
<line><product code="2" parent="1"/></line>
<line><product code="6" parent="X"/></line>
<line><product code="1" /></line>
<line><product code="4" parent="2"/></line>
</order>
Into this:
<order>
<line><product code="6" parent="X"/></line>
<line><product code="1" /></line>
<line><product code="2" parent="1"/></line>
<line><product code="3" parent="1"/></line>
<line><product code="4" parent="2"/></line>
</order>
Note that the order within a specific level is not important, as long as the child node follows at some point after its parent.
I have a solution which works for hierarchies that do not exceed a predefined depth:
<order>
<xsl:variable name="level-0"
select="/order/line[ not(product/@parent=../line/product/@code) ]"/>
<xsl:for-each select="$level-0">
<xsl:copy-of select="."/>
</xsl:for-each>
<xsl:variable name="level-1"
select="/order/line[ product/@parent=$level-0/product/@code ]"/>
<xsl:for-each select="$level-1">
<xsl:copy-of select="."/>
</xsl:for-each>
<xsl:variable name="level-2"
select="/order/line[ product/@parent=$level-1/product/@code ]"/>
<xsl:for-each select="$level-2">
<xsl:copy-of select="."/>
</xsl:for-each>
</order>
The above sample XSLT works for hierarchies with a maximum depth of 3 levels and is easily extended to more, but how can I generalize this so that the XSLT sorts arbitrary levels of depth correctly?
To start with, you could define a couple of keys to help you look up the line elements by either their code or parent attribute:
<xsl:key name="products-by-parent" match="line" use="product/@parent" />
<xsl:key name="products-by-code" match="line" use="product/@code" />
You would start off by selecting the line elements whose parent is absent or not present in the order, using a key to do this check:
<xsl:apply-templates select="line[not(key('products-by-code', product/@parent))]"/>
Then, within the template that matches the line element, you would just copy the element, and then select its "children" like so, using the other key:
<xsl:apply-templates select="key('products-by-parent', product/@code)"/>
This would be a recursive call, so it would recursively look for its children until no more are found.
Try this XSLT
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:key name="products-by-parent" match="line" use="product/@parent"/>
<xsl:key name="products-by-code" match="line" use="product/@code"/>
<xsl:template match="order">
<xsl:copy>
<xsl:apply-templates select="line[not(key('products-by-code', product/@parent))]"/>
</xsl:copy>
</xsl:template>
<xsl:template match="line">
<xsl:call-template name="identity"/>
<xsl:apply-templates select="key('products-by-parent', product/@code)"/>
</xsl:template>
<xsl:template match="@*|node()" name="identity">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Do note the use of the XSLT identity transform to copy the existing nodes in the XML.
Very interesting problem. I would do this in two passes: first, nest the elements according to their hierarchy. Then output the elements, sorted by the count of their ancestors.
XSLT 1.0 (+ EXSLT node-set() function):
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
extension-element-prefixes="exsl">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:key name="product-by-code" match="product" use="@code" />
<!-- first pass -->
<xsl:variable name="nested">
<xsl:apply-templates select="/order/line/product[not(key('product-by-code', @parent))]" mode="nest"/>
</xsl:variable>
<xsl:template match="product" mode="nest">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates select="../../line/product[@parent=current()/@code]" mode="nest"/>
</xsl:copy>
</xsl:template>
<!-- output -->
<xsl:template match="/order">
<xsl:copy>
<xsl:for-each select="exsl:node-set($nested)//product">
<xsl:sort select="count(ancestor::*)" data-type="number" order="ascending"/>
<line><product><xsl:copy-of select="@*"/></product></line>
</xsl:for-each>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
When applied to your input, the result is:
<?xml version="1.0" encoding="UTF-8"?>
<order>
<line>
<product code="6" parent="X"/>
</line>
<line>
<product code="1"/>
</line>
<line>
<product code="3" parent="1"/>
</line>
<line>
<product code="2" parent="1"/>
</line>
<line>
<product code="4" parent="2"/>
</line>
</order>
This still leaves the issue of the existing/missing parent X - I will try to address that later.
