XPath: filter out children

I'm looking for an XPath expression that filters out certain children. A child should be kept only if it contains a CCC node with the value B.
Source:
<AAA>
<BBB1>
<CCC>A</CCC>
</BBB1>
<BBB2>
<CCC>A</CCC>
</BBB2>
<BBB3>
<CCC>B</CCC>
</BBB3>
<BBB4>
<CCC>B</CCC>
</BBB4>
</AAA>
This should be the result:
<AAA>
<BBB3>
<CCC>B</CCC>
</BBB3>
<BBB4>
<CCC>B</CCC>
</BBB4>
</AAA>
Hopefully someone can help me.
Jos

XPath is a query language for XML documents. As such it can only select nodes from existing XML document(s) -- it cannot modify an XML document or create a new XML document.
Use XSLT in order to transform an XML document and create a new XML document from it.
In this particular case:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/*/*[not(CCC = 'B')]"/>
</xsl:stylesheet>
when this transformation is applied on the provided XML document:
<AAA>
<BBB1>
<CCC>A</CCC>
</BBB1>
<BBB2>
<CCC>A</CCC>
</BBB2>
<BBB3>
<CCC>B</CCC>
</BBB3>
<BBB4>
<CCC>B</CCC>
</BBB4>
</AAA>
the wanted, correct result is produced:
<AAA>
<BBB3>
<CCC>B</CCC>
</BBB3>
<BBB4>
<CCC>B</CCC>
</BBB4>
</AAA>

In order to select all of the desired element and text nodes, use this XPATH:
//node()[.//CCC[.='B']
or self::CCC[.='B']
or self::text()[parent::CCC[.='B']]]
This can be achieved more simply by using XPath in a modified identity transform XSLT:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes" />
<!--Empty template for the content we want to redact -->
<xsl:template match="*[CCC[not(.='B')]]" />
<!--By default, copy all content forward -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

Try this:
"//CCC[text() = 'B']"
It selects all CCC nodes whose inner text is B.

If you want to get AAA, BBB3 and BBB4, you can use the following:
//*[descendant::CCC[text()='B']]
If you want only BBB3 and BBB4, use:
//*[CCC[text()='B']]
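For readers who want to try the selection outside an XSLT processor, here is a minimal Python sketch using the standard library's xml.etree.ElementTree, which supports only a subset of XPath. It assumes the sample document from the question; the helper name children_with_ccc is hypothetical. For this input, the ElementTree predicate [CCC='B'] plays the role of *[CCC[text()='B']]. The sketch also shows that selecting nodes leaves the document unchanged, so the unwanted children still have to be removed in a separate step.

```python
import xml.etree.ElementTree as ET

SOURCE = """<AAA>
<BBB1><CCC>A</CCC></BBB1>
<BBB2><CCC>A</CCC></BBB2>
<BBB3><CCC>B</CCC></BBB3>
<BBB4><CCC>B</CCC></BBB4>
</AAA>"""


def children_with_ccc(root, value):
    """Return the children of root that have a CCC child whose text equals value."""
    return root.findall(f"./*[CCC='{value}']")


root = ET.fromstring(SOURCE)
kept = children_with_ccc(root, "B")
print([e.tag for e in kept])

# Selection alone does not modify the tree; building the filtered
# document means removing every child that was not selected.
for child in list(root):
    if child not in kept:
        root.remove(child)
print([e.tag for e in root])
```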

Related

XSLT apply templates select condition on node list

I have an XML document with a list and want to apply a template that processes only the nodes matching a condition, but the template is applied to the whole list. Could someone tell me if I am missing anything? I am relatively new to XSL.
The condition I wanted to apply is if dep is 7 and no city tag exists. I started with the condition to check if dep is 7. After apply-templates, if I print my list, I get all of them instead of just the dep elements with value 7. In my output I expect not to have the dep with value 9.
Input XML:
<employeeList>
<employee>
<dep>7</dep>
<salary>900</salary>
</employee>
<employee>
<dep>7</dep>
<city>LA</city>
<salary>500</salary>
</employee>
<employee>
<dep>9</dep>
<salary>600</salary>
</employee>
<employee>
<dep>7</dep>
<salary>800</salary>
</employee>
</employeeList>
My XSL:
<xsl:apply-templates select="employeeList[employee/dep = '7']" mode="e"/>
<xsl:template match="employeeList" mode="e">
<xsl:for-each select="employee">
<dep>
<xsl:value-of select="dep" />
</dep>
</xsl:for-each>
Output XML:
<dep>7</dep><dep>7</dep><dep>9</dep><dep>7</dep>
The condition I wanted to apply is if dep is 7 and no city tag exists
Such a condition can easily be implemented using, e.g.:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/employeeList">
<root>
<xsl:for-each select="employee[dep='7' and not(city)]">
<dep>7</dep>
</xsl:for-each>
</root>
</xsl:template>
</xsl:stylesheet>
Or, more briefly:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/employeeList">
<root>
<xsl:copy-of select="employee[dep='7' and not(city)]/dep"/>
</root>
</xsl:template>
</xsl:stylesheet>
But it's hard to see the point in outputting X number of <dep>7</dep> elements.
You select the employeeList based on a condition on its employee/dep, but once you have selected it, that condition no longer matters, and the <xsl:for-each select="employee"> selects all employees, regardless of their dep.
You can repeat the condition in the xsl:for-each statement:
<xsl:for-each select="employee[dep = '7']">
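The same point can be sketched in Python with the standard library's xml.etree.ElementTree, assuming the input XML from the question. This is only an illustration of where the condition has to sit, not a replacement for the XSLT. ElementTree has no not(city) predicate, so the "no city" check is done in Python.

```python
import xml.etree.ElementTree as ET

EMPLOYEES = """<employeeList>
<employee><dep>7</dep><salary>900</salary></employee>
<employee><dep>7</dep><city>LA</city><salary>500</salary></employee>
<employee><dep>9</dep><salary>600</salary></employee>
<employee><dep>7</dep><salary>800</salary></employee>
</employeeList>"""

root = ET.fromstring(EMPLOYEES)

# Iterating over "employee" without a predicate yields every employee,
# no matter how the surrounding list element was selected.
all_employees = root.findall("employee")
print(len(all_employees))

# The condition belongs on the employee step: dep is 7 and no city child.
selected = [e for e in root.findall("employee[dep='7']") if e.find("city") is None]
salaries = [e.findtext("salary") for e in selected]
print(salaries)
```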

Passing parameters from script to XSL

Using XSLT2 with the latest Saxon HE.
I'm trying to pass multiple coordinate parameters from a script to XSL in order to filter results based on a location bounding box.
Script:
java -jar saxon9he.jar -s:litter_bins.xml -o:"bins.xml" -xsl:"Split xml coords.xsl" Coord_2=51.3725 Coord_4=51.3751 Coord_1=-2.3615 Coord_3=-2.3572
XSL:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="Coord_2" select="Coord_2"/>
<xsl:param name="Coord_4" select="Coord_4"/>
<xsl:param name="Coord_1" select="Coord_1"/>
<xsl:param name="Coord_3" select="Coord_3"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="node[@lat[ . &lt; $Coord_2 or . > $Coord_4 ] or @lon[ . &lt; $Coord_1 or . > $Coord_3]]"/>
</xsl:stylesheet>
The above returns:
<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="JOSM"/>
However if I hard code the coordinates into the match xpath, it returns the expected results.
Xpath:
<xsl:template match="node[@lat[ . &lt; 51.3725 or . > 51.3751 ] or @lon[ . &lt; -2.3615 or . > -2.3572]]"/>
Results:
<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6" generator="JOSM">
<node id="-102973" visible="true" lat="51.37283499216" lon="-2.359890029">
<tag k="date_creat" v="17/07/2014 07:59:04 AM UTC"/>
<tag k="form_recor" v="888"/>
</node>
<snip...>
</osm>
What am I misunderstanding?
Try to declare a numeric type for the parameters e.g. <xsl:param name="Coord_2" as="xs:double"/> or <xsl:param name="Coord_2" as="xs:decimal"/>. Of course for that your stylesheet needs to declare xmlns:xs="http://www.w3.org/2001/XMLSchema" as a namespace declaration on the root element.
Without a numeric type I think the comparison will be of two xs:untypedAtomic values and then https://www.w3.org/TR/xpath-31/#id-general-comparisons demands
If both atomic values are instances of xs:untypedAtomic, then the
values are cast to the type xs:string
and then the string comparison of negative numbers fails to give you the wanted result.
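The effect is easy to reproduce outside XSLT. The following Python lines are only an analogy (Python's string ordering stands in for the XPath string comparison), using the lon value of the sample node and the Coord_1 value from the command line:

```python
lon = "-2.359890029"   # lon attribute of the sample node
coord_1 = "-2.3615"    # lower longitude bound passed as a parameter

# As strings, the values are compared character by character:
# "-2.35..." sorts before "-2.36...", so the node appears to be out of bounds.
as_strings = lon < coord_1
print(as_strings)

# As numbers, -2.3598... is greater than -2.3615, so the node is within bounds.
as_numbers = float(lon) < float(coord_1)
print(as_numbers)
```

With the string comparison the node matches the empty template and is dropped, which is consistent with the nearly empty output shown in the question.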

xslt Merge children of 2 parents and Store in a variable

I receive an xml input like this:
<root>
<Tuple1>
<child11></child11>
<child12></child12>
<child13></child13>
</Tuple1>
<Tuple1>
<child11></child11>
<child12></child12>
</Tuple1>
<Tuple2>
<child21></child21>
<child22></child22>
</Tuple2>
<Tuple2>
<child21></child21>
<child22></child22>
<child23></child23>
</Tuple2>
</root>
How can I merge the children of each Tuple1 with children of Tuple2 and store them in a variable that will be used in the rest of xslt document?
The first Tuple1 will be merged with the first Tuple2, the second Tuple1 with the second Tuple2, and so on. The merged output that should be stored in the variable would look like this in memory:
<root>
<Tuple1>
<child11></child11>
<child12></child12>
<child13></child13>
<child21></child21>
<child22></child22>
</Tuple1>
<Tuple1>
<child11></child11>
<child12></child12>
<child21></child21>
<child22></child22>
<child23></child23>
</Tuple1>
</root>
Is a variable the best option? If we use a variable, is it created once, or is it created every time it is referenced?
I use XSLT 3.0, so a solution for any version would help.
Thanks, I appreciate your help.
Here is a minimal XSLT 3 approach:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="root">
<xsl:variable name="temp1">
<xsl:copy>
<xsl:apply-templates select="Tuple1"/>
</xsl:copy>
</xsl:variable>
<xsl:copy-of select="$temp1"/>
</xsl:template>
<xsl:template match="Tuple1">
<xsl:copy>
<xsl:copy-of select="*, let $pos := position() return ../Tuple2[$pos]/*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Online at https://xsltfiddle.liberty-development.net/bdxtqg. I have used XPath's let instead of XSLT's xsl:variable to store the position used to access the specific Tuple2.
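The positional pairing itself can be sketched in Python with the standard library, assuming the input from the question. This is only an illustration of the merge logic; the variable names are hypothetical. The nth Tuple1 is paired with the nth Tuple2 and the children of both are copied into a new Tuple1.

```python
import copy
import xml.etree.ElementTree as ET

INPUT = """<root>
<Tuple1><child11/><child12/><child13/></Tuple1>
<Tuple1><child11/><child12/></Tuple1>
<Tuple2><child21/><child22/></Tuple2>
<Tuple2><child21/><child22/><child23/></Tuple2>
</root>"""

source = ET.fromstring(INPUT)
merged = ET.Element("root")

# zip() pairs the first Tuple1 with the first Tuple2, and so on.
for t1, t2 in zip(source.findall("Tuple1"), source.findall("Tuple2")):
    out = ET.SubElement(merged, "Tuple1")
    # Copy the children so the source tree stays untouched.
    out.extend([copy.deepcopy(c) for c in list(t1) + list(t2)])

merged_tags = [[c.tag for c in t] for t in merged]
print(merged_tags)
```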

How do I use the msxsl:node-set to get a node set that I can use in a template parameter?

TL;DR; Why can't I use the element name in the XPATH going against a msxsl:node-set? It always returns nothing, as if the node-set is empty, when debugging shows that it is not empty.
Details: I need to use a node-set in an XSLT 1.0 document because my source XML is missing an important node. Instead of having to rewrite the entire XSLT, I'd like to instead inject a node-set so that my XSLT processing can continue as normal. I would like to use XPATH on the node-set but I am not able to use the actual element names, instead only a * works, but I am not sure why, or how I can access the actual element names in the XPATH.
Here is my XML (example only, the XML document here is the least important, see XSLT):
<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="generic.xslt" ?>
<ParentNode xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" i:noNamespaceSchemaLocation="generic.xsd">
<SomeChildNode>text</SomeChildNode>
</ParentNode>
Here is my XSLT:
<?xml version="1.0" encoding="utf-16"?>
<xsl:stylesheet version="1.0" xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2" xmlns:msxsl="urn:schemas-microsoft-com:xslt" xmlns:a="http://schemas.datacontract.org/2004/07/MeM.BizEntities" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<xsl:output method="xml" indent="yes" encoding="utf-16" omit-xml-declaration="no" />
<!-- Global Variables, used in multiple places -->
<xsl:variable name="empty"/>
<!-- Match Templates -->
<xsl:template match="ParentNode">
<ArrayOfSalesOrder>
<xsl:for-each select="SomeChildNode">
<xsl:call-template name="SomeChildNodeTemplate">
<xsl:with-param name="order" select="."/>
</xsl:call-template>
</xsl:for-each>
</ArrayOfSalesOrder>
</xsl:template>
<xsl:template name="SomeChildNodeTemplate">
<xsl:variable name="someRTF">
<Items>
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
</xsl:variable>
<xsl:call-template name="ItemsTemplate">
<xsl:with-param name="items" select="msxsl:node-set($someRTF)"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="ItemsTemplate">
<xsl:param name="items"/>
<ItemsTransformed>
<xsl:for-each select="$items/Item">
<NewItem>
<NewCode>
<xsl:value-of select="Code"/>
</NewCode>
</NewItem>
</xsl:for-each>
</ItemsTransformed>
<ItemsTransformedThatWorksButNotHowIWant>
<xsl:for-each select="$items/*/*">
<NewItem>
<NewCode>
<xsl:value-of select="*[1]"/>
</NewCode>
<NewPrice>
<xsl:value-of select="*[2]"/>
</NewPrice>
<NewQuantity>
<xsl:value-of select="*[3]"/>
</NewQuantity>
</NewItem>
</xsl:for-each>
</ItemsTransformedThatWorksButNotHowIWant>
</xsl:template>
</xsl:stylesheet>
I would expect to be able to use XPATH to query into the node-set such that I can use their proper element names. This doesn't seem to be the case, and I'm struggling to understand why. I know there can be namespacing issues, but trying *:Item etc. doesn't work for me. I am able to use *[local-name()='Item'] but this seems like a horrible work around, not to mention that I'll have to rewrite any downstream templates and that is what I'm trying to avoid by using the node-set in the first place.
Result:
<?xml version="1.0" encoding="utf-16"?>
<ArrayOfSalesOrder xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2" xmlns:msxsl="urn:schemas-microsoft-com:xslt" xmlns:a="http://schemas.datacontract.org/2004/07/MeM.BizEntities" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<ItemsTransformed />
<ItemsTransformedThatWorksButNotHowIWant>
<NewItem>
<NewCode>code</NewCode>
<NewPrice>75</NewPrice>
<NewQuantity>1</NewQuantity>
</NewItem>
<NewItem>
<NewCode>code2</NewCode>
<NewPrice>100</NewPrice>
<NewQuantity>3</NewQuantity>
</NewItem>
</ItemsTransformedThatWorksButNotHowIWant>
</ArrayOfSalesOrder>
As you can see, I can get it to work with * but this is not very usable on a more complex structure. What am I doing wrong? Does this have to do with namespaces?
I would expect to see something under the <ItemsTransformed /> node, but instead it is just empty, and so far I can't get anything except the * to work.
The SO question below is what I was using, I thought I had an answer there, but I can't get the XPATH to work.
Reference:
XSLT 1.0 - Create node set and pass as a parameter
The problem here is that your stylesheet has a default namespace:
xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2"
Therefore, when you do:
<xsl:variable name="someRTF">
<Items>
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
</xsl:variable>
you are populating your variable with elements in the default namespace, so the variable actually contains:
<Items xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2">
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
Naturally, when you try later to select something like:
<xsl:for-each select="xyz:node-set($someRTF)/Items/Item">
you select nothing, because both Items and Item are in the default namespace and you're not calling them by their fully qualified name.
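The same behaviour shows up in any namespace-aware XML API. Here is a small Python sketch with the standard library's xml.etree.ElementTree, using a shortened version of the variable content; the prefix d is an arbitrary, hypothetical choice. An unqualified name finds nothing, while a name bound to the namespace finds the elements.

```python
import xml.etree.ElementTree as ET

NS = "http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2"
items = ET.fromstring(
    f'<Items xmlns="{NS}">'
    "<Item><Code>code</Code></Item>"
    "<Item><Code>code2</Code></Item>"
    "</Items>"
)

# "Item" without a namespace does not match {NS}Item.
unqualified = items.findall("Item")
print(len(unqualified))

# Binding a prefix to the namespace makes the names match.
prefixes = {"d": NS}
qualified = items.findall("d:Item", prefixes)
codes = [item.findtext("d:Code", namespaces=prefixes) for item in qualified]
print(codes)
```

In XSLT 1.0 the equivalent is to declare a prefix for that namespace on the stylesheet element and use it in the path, for example d:Items/d:Item.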
--- edit: ---
The problem can be easily solved by making sure that the root element of the variable - and by extension, all its descendants - are in no namespace.
Here's a simplified example (will run with any input):
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2"
xmlns:exsl="http://exslt.org/common"
exclude-result-prefixes="exsl">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:variable name="someRTF">
<Items xmlns="">
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
</xsl:variable>
<xsl:template match="/">
<ArrayOfSalesOrder>
<ItemsTransformed>
<xsl:for-each select="exsl:node-set($someRTF)/Items/Item">
<NewItem>
<NewCode>
<xsl:value-of select="Code"/>
</NewCode>
</NewItem>
</xsl:for-each>
</ItemsTransformed>
</ArrayOfSalesOrder>
</xsl:template>
</xsl:stylesheet>
Result:
<?xml version="1.0" encoding="UTF-8"?>
<ArrayOfSalesOrder xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2">
<ItemsTransformed>
<NewItem>
<NewCode>code</NewCode>
</NewItem>
<NewItem>
<NewCode>code2</NewCode>
</NewItem>
</ItemsTransformed>
</ArrayOfSalesOrder>

How to order self-referencing xml

I have a list of order lines, each with one product on it. The products may form a self-referencing hierarchy. I need to order the lines in such a way that all products that have no parent, or whose parent is missing from the order, are at the top, followed by their children. No child may be above its parent in the end result.
So how can I order the following XML:
<order>
<line><product code="3" parent="1"/></line>
<line><product code="2" parent="1"/></line>
<line><product code="6" parent="X"/></line>
<line><product code="1" /></line>
<line><product code="4" parent="2"/></line>
</order>
Into this:
<order>
<line><product code="6" parent="X"/></line>
<line><product code="1" /></line>
<line><product code="2" parent="1"/></line>
<line><product code="3" parent="1"/></line>
<line><product code="4" parent="2"/></line>
</order>
Note that the order within a specific level is not important, as long as the child node follows at some point after its parent.
I have a solution which works for hierarchies that do not exceed a predefined depth:
<order>
<xsl:variable name="level-0"
select="/order/line[ not(product/@parent=../line/product/@code) ]"/>
<xsl:for-each select="$level-0">
<xsl:copy-of select="."/>
</xsl:for-each>
<xsl:variable name="level-1"
select="/order/line[ product/@parent=$level-0/product/@code ]"/>
<xsl:for-each select="$level-1">
<xsl:copy-of select="."/>
</xsl:for-each>
<xsl:variable name="level-2"
select="/order/line[ product/@parent=$level-1/product/@code ]"/>
<xsl:for-each select="$level-2">
<xsl:copy-of select="."/>
</xsl:for-each>
</order>
The above sample XSLT will work for hierarchies with a maximum depth of 3 levels and is easily extended to more, but how can I generalize this and have the XSLT sort arbitrary levels of depth correctly?
To start with, you could define a couple of keys to help you look up the line elements by either their code or parent attribute:
<xsl:key name="products-by-parent" match="line" use="product/@parent" />
<xsl:key name="products-by-code" match="line" use="product/@code" />
You would start off by selecting the line elements with no parent, using a key to do this check:
<xsl:apply-templates select="line[not(key('products-by-code', product/@parent))]"/>
Then, within the template that matches the line element, you would just copy the element, and then select its "children" like so, using the other key:
<xsl:apply-templates select="key('products-by-parent', product/@code)"/>
This would be a recursive call, so it would recursively look for its children until no more are found.
Try this XSLT
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:key name="products-by-parent" match="line" use="product/@parent"/>
<xsl:key name="products-by-code" match="line" use="product/@code"/>
<xsl:template match="order">
<xsl:copy>
<xsl:apply-templates select="line[not(key('products-by-code', product/@parent))]"/>
</xsl:copy>
</xsl:template>
<xsl:template match="line">
<xsl:call-template name="identity"/>
<xsl:apply-templates select="key('products-by-parent', product/@code)"/>
</xsl:template>
<xsl:template match="@*|node()" name="identity">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Do note the use of the XSLT identity transform to copy the existing nodes in the XML.
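For comparison, the same recursive idea can be sketched in Python with the standard library, assuming the order document from the question. The function name sort_lines is hypothetical, and the two lookups stand in for the two xsl:key declarations. Lines whose parent is absent from the order are emitted first, each one followed depth-first by its children.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ORDER = """<order>
<line><product code="3" parent="1"/></line>
<line><product code="2" parent="1"/></line>
<line><product code="6" parent="X"/></line>
<line><product code="1" /></line>
<line><product code="4" parent="2"/></line>
</order>"""


def sort_lines(order):
    """Return the line elements so that no child precedes its parent."""
    lines = order.findall("line")
    # Counterpart of the "products-by-code" key: all codes present in the order.
    codes = {line.find("product").get("code") for line in lines}
    # Counterpart of the "products-by-parent" key: lines grouped by parent code.
    by_parent = defaultdict(list)
    for line in lines:
        by_parent[line.find("product").get("parent")].append(line)

    result = []

    def emit(line):
        # Output the line, then recurse into the lines that name it as parent.
        result.append(line)
        for child in by_parent.get(line.find("product").get("code"), []):
            emit(child)

    # Start from lines with no parent or with a parent missing from the order.
    for line in lines:
        if line.find("product").get("parent") not in codes:
            emit(line)
    return result


order = ET.fromstring(ORDER)
sorted_codes = [line.find("product").get("code") for line in sort_lines(order)]
print(sorted_codes)
```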
Very interesting problem. I would do this in two passes: first, nest the elements according to their hierarchy. Then output the elements, sorted by the count of their ancestors.
XSLT 1.0 (+ EXSLT node-set() function):
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
extension-element-prefixes="exsl">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:key name="product-by-code" match="product" use="@code" />
<!-- first pass -->
<xsl:variable name="nested">
<xsl:apply-templates select="/order/line/product[not(key('product-by-code', @parent))]" mode="nest"/>
</xsl:variable>
<xsl:template match="product" mode="nest">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:apply-templates select="../../line/product[@parent=current()/@code]" mode="nest"/>
</xsl:copy>
</xsl:template>
<!-- output -->
<xsl:template match="/order">
<xsl:copy>
<xsl:for-each select="exsl:node-set($nested)//product">
<xsl:sort select="count(ancestor::*)" data-type="number" order="ascending"/>
<line><product><xsl:copy-of select="@*"/></product></line>
</xsl:for-each>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
When applied to your input, the result is:
<?xml version="1.0" encoding="UTF-8"?>
<order>
<line>
<product code="6" parent="X"/>
</line>
<line>
<product code="1"/>
</line>
<line>
<product code="3" parent="1"/>
</line>
<line>
<product code="2" parent="1"/>
</line>
<line>
<product code="4" parent="2"/>
</line>
</order>
This still leaves the issue of the existing/missing parent X - I will try to address that later.
