XSLT first occurrence of element based on duplicate value - for-loop

The given XML:
<records>
<record>
<country>Germany</country>
<value>123</value>
</record>
<record>
<country>Germany</country>
<value>62</value>
</record>
<record>
<country>Germany</country>
<value>033</value>
</record>
<record>
<country>Armenia</country>
<value>444</value>
</record>
<record>
<country>Armenia</country>
<value>212</value>
</record>
<record>
<country>Armenia</country>
<value>864</value>
</record>
</records>
How do I get output that keeps only the first <record> for each distinct <country> value?
The desired output should look like:
<records>
<record>
<country>Germany</country>
<value>123</value>
</record>
<record>
<country>Armenia</country>
<value>444</value>
</record>
</records>
UPDATE: Solved my problem with the following XSL:
<xsl:key name="country" match="record" use="value" />
<xsl:template match="records">
<xsl:apply-templates select="record[1]" />
</xsl:template>
<xsl:template match="record">
<xsl:for-each select="key('country', value)">
<country><xsl:value-of select="country"/></country>
<value><xsl:value-of select="value"/></value>
</xsl:for-each>
</xsl:template>
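Note that the key in the stylesheet above is built on value and only record[1] is processed, so as posted it appears to emit just the first record. A minimal sketch of the usual XSLT 1.0 approach (Muenchian grouping), assuming the input shown in the question; the key name by-country is made up for illustration:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <!-- Index every record by the string value of its country child -->
  <xsl:key name="by-country" match="record" use="country"/>
  <xsl:template match="records">
    <records>
      <!-- Keep a record only if it is the first node in its country's key group -->
      <xsl:copy-of select="record[generate-id() = generate-id(key('by-country', country)[1])]"/>
    </records>
  </xsl:template>
</xsl:stylesheet>
```

The generate-id() comparison tests node identity, so only the first record of each country survives the predicate.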

Related

Xpath 1.0 nodelist based on node names

I don't like to ask for help, but this time I'm totally stuck with an XPath query.
Please have a look at this XML:
<doc>
<car>
<property id="color">
<attribute id="black" />
<attribute id="white" />
<attribute id="green" />
</property>
<property id="size">
<attribute id="small" />
<attribute id="medium" />
<attribute id="large" />
</property>
</car>
<attributes>
<color>white</color>
<size>small</size>
</attributes>
</doc>
The car/property elements should be output according to the element names under attributes. The desired output is:
<property id="color"><attribute id="white" /></property>
<property id="size"><attribute id="small" /></property>
The xpath
/doc/car/property[@id=name(/doc/attributes/*)]/attribute[@id=/doc/attributes/*/text()]
returns only the first node, because the name() function returns only the name of the first element in the node-set.
Can anyone help me find a working XPath (XSLT 1.0)? Many thanks in advance!
You can achieve this with XSLT 1.0, but not with XPath 1.0 alone, because in XPath 1.0 name() applied to a node-set returns only the name of its first node. This is not a problem in XSLT 1.0, because you can iterate with an xsl:for-each loop, like the following:
<xsl:for-each select="/doc/attributes/*">
<property id="{/doc/car/property[@id=local-name(current())]/@id}"><attribute id="{/doc/car/property[@id=local-name(current())]/attribute[@id=current()]/@id}" /></property>
</xsl:for-each>
This code emits the following XML:
<property id="color"><attribute id="white"/></property>
<property id="size"><attribute id="small"/></property>
As seen, your requirements seem to be a little bit redundant, but I guess that your larger scenario justifies the means.
What about these options (it's still unclear to me why you're using name(), since I don't see any namespace in your sample data):
//property|//attribute[@id=//attributes/*]
//attribute[@id=//attributes/*]|//attribute[@id=//attributes/*]/parent::property
//property|//attribute[@id=substring-before(normalize-space(//attributes)," ") or @id=substring-after(normalize-space(//attributes)," ")]
The third option should work even if you have to deal with a namespace for the @id inside the attributes node.
My working solution:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:for-each select="/doc/car/property">
<property id="{@id}">
<xsl:variable name="id" select="@id" />
<xsl:copy-of select="attribute[@id=/doc/attributes/*[name()=$id]/text()]" />
</property>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Another solution without using a loop:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:apply-templates select="doc/car/property"/>
</xsl:template>
<xsl:template match="property">
<xsl:copy>
<xsl:copy-of select="@*"/>
<xsl:copy-of select="attribute[@id = /doc/attributes/*[name() = current()/@id]]"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
For each property it copies the element node and its attributes. Then it copies its attribute children having an id matching the respective element below /doc/attributes.

XPath results based on two nodes

I have XML that has a lot of duplicated values. I'd like to select all the rows with a specific section ("sec") and section tag ("sec_tag"), but I can't seem to get the XPath correct.
Here's a small snippet of the XML:
<root>
<record>
<sec>5</sec>
<sec_tag>919</sec_tag>
<nested_tag>
<info>Info</info>
<types>
<type>1</type>
<type>2</type>
<type>3</type>
</types>
</nested_tag>
<flags>00000000</flags>
</record>
<record>
<sec>5</sec>
<sec_tag>930</sec_tag>
<nested_tag>
<info>Info</info>
<types>
<type>1</type>
<type>2</type>
<type>3</type>
</types>
</nested_tag>
<flags>00000000</flags>
</record>
<record>
<sec>7</sec>
<sec_tag>919</sec_tag>
<nested_tag>
<info>Info</info>
<types>
<type>1</type>
<type>2</type>
<type>3</type>
</types>
</nested_tag>
<flags>00000000</flags>
</record>
</root>
I want the node that has <sec>5</sec> and <sec_tag>919</sec_tag>.
I tried something like this:
//sec[text(), "5"] and //sec_tag[text(), "919"]
Obviously that's not the correct syntax there, I just need to find the correct XPath expression.
You can use the following XPath expression to return the record elements whose child sec equals 5 and whose child sec_tag equals 919:
//record[sec = 5 and sec_tag = 919]
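To try the expression in context, here is a minimal XSLT 1.0 sketch that copies the matching records. It assumes the sample XML from the question as input; the matches wrapper element is made up for illustration:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <!-- Hypothetical wrapper so the result has a single root element -->
    <matches>
      <!-- Both conditions sit in one predicate, so they are tested on the same record -->
      <xsl:copy-of select="//record[sec = 5 and sec_tag = 919]"/>
    </matches>
  </xsl:template>
</xsl:stylesheet>
```

Putting both tests inside a single predicate is what ties them to the same record. The attempt in the question evaluates two unrelated paths and combines them into one boolean.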

How do I use the msxsl:node-set to get a node set that I can use in a template parameter?

TL;DR; Why can't I use the element name in the XPATH going against a msxsl:node-set? It always returns nothing, as if the node-set is empty, when debugging shows that it is not empty.
Details: I need to use a node-set in an XSLT 1.0 document because my source XML is missing an important node. Instead of having to rewrite the entire XSLT, I'd like to instead inject a node-set so that my XSLT processing can continue as normal. I would like to use XPATH on the node-set but I am not able to use the actual element names, instead only a * works, but I am not sure why, or how I can access the actual element names in the XPATH.
Here is my XML (example only, the XML document here is the least important, see XSLT):
<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="generic.xslt" ?>
<ParentNode xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" i:noNamespaceSchemaLocation="generic.xsd">
<SomeChildNode>text</SomeChildNode>
</ParentNode>
Here is my XSLT:
<?xml version="1.0" encoding="utf-16"?>
<xsl:stylesheet version="1.0" xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2" xmlns:msxsl="urn:schemas-microsoft-com:xslt" xmlns:a="http://schemas.datacontract.org/2004/07/MeM.BizEntities" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<xsl:output method="xml" indent="yes" encoding="utf-16" omit-xml-declaration="no" />
<!-- Global Variables, used in multiple places -->
<xsl:variable name="empty"/>
<!-- Match Templates -->
<xsl:template match="ParentNode">
<ArrayOfSalesOrder>
<xsl:for-each select="SomeChildNode">
<xsl:call-template name="SomeChildNodeTemplate">
<xsl:with-param name="order" select="."/>
</xsl:call-template>
</xsl:for-each>
</ArrayOfSalesOrder>
</xsl:template>
<xsl:template name="SomeChildNodeTemplate">
<xsl:variable name="someRTF">
<Items>
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
</xsl:variable>
<xsl:call-template name="ItemsTemplate">
<xsl:with-param name="items" select="msxsl:node-set($someRTF)"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="ItemsTemplate">
<xsl:param name="items"/>
<ItemsTransformed>
<xsl:for-each select="$items/Item">
<NewItem>
<NewCode>
<xsl:value-of select="Code"/>
</NewCode>
</NewItem>
</xsl:for-each>
</ItemsTransformed>
<ItemsTransformedThatWorksButNotHowIWant>
<xsl:for-each select="$items/*/*">
<NewItem>
<NewCode>
<xsl:value-of select="*[1]"/>
</NewCode>
<NewPrice>
<xsl:value-of select="*[2]"/>
</NewPrice>
<NewQuantity>
<xsl:value-of select="*[3]"/>
</NewQuantity>
</NewItem>
</xsl:for-each>
</ItemsTransformedThatWorksButNotHowIWant>
</xsl:template>
</xsl:stylesheet>
I would expect to be able to use XPATH to query into the node-set such that I can use their proper element names. This doesn't seem to be the case, and I'm struggling to understand why. I know there can be namespacing issues, but trying *:Item etc. doesn't work for me. I am able to use *[local-name()='Item'] but this seems like a horrible work around, not to mention that I'll have to rewrite any downstream templates and that is what I'm trying to avoid by using the node-set in the first place.
Result:
<?xml version="1.0" encoding="utf-16"?>
<ArrayOfSalesOrder xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2" xmlns:msxsl="urn:schemas-microsoft-com:xslt" xmlns:a="http://schemas.datacontract.org/2004/07/MeM.BizEntities" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<ItemsTransformed />
<ItemsTransformedThatWorksButNotHowIWant>
<NewItem>
<NewCode>code</NewCode>
<NewPrice>75</NewPrice>
<NewQuantity>1</NewQuantity>
</NewItem>
<NewItem>
<NewCode>code2</NewCode>
<NewPrice>100</NewPrice>
<NewQuantity>3</NewQuantity>
</NewItem>
</ItemsTransformedThatWorksButNotHowIWant>
</ArrayOfSalesOrder>
As you can see, I can get it to work with * but this is not very usable on a more complex structure. What am I doing wrong? Does this have to do with namespaces?
I would expect to see something under the <ItemsTransformed /> node, but instead it is just empty, and so far I can't get anything except the * to work.
The SO question below is what I was using, I thought I had an answer there, but I can't get the XPATH to work.
Reference:
XSLT 1.0 - Create node set and pass as a parameter
The problem here is that your stylesheet has a default namespace:
xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2"
Therefore, when you do:
<xsl:variable name="someRTF">
<Items>
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
</xsl:variable>
you are populating your variable with elements in the default namespace, so the variable actually contains:
<Items xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2">
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
Naturally, when you try later to select something like:
<xsl:for-each select="xyz:node-set($someRTF)/Items/Item">
you select nothing, because both Items and Item are in the default namespace and you're not calling them by their fully qualified name.
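One way to call them by their qualified names is to bind a prefix to the same namespace URI and use that prefix in the path. A minimal sketch, assuming a made-up prefix d and the EXSLT node-set() function (msxsl:node-set is used the same way on Microsoft processors); it runs against any input:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2"
    xmlns:d="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">
  <!-- These literal elements inherit the stylesheet's default namespace -->
  <xsl:variable name="someRTF">
    <Items>
      <Item><Code>code</Code></Item>
      <Item><Code>code2</Code></Item>
    </Items>
  </xsl:variable>
  <xsl:template match="/">
    <ItemsTransformed>
      <!-- XPath 1.0 ignores the default namespace, so the d: prefix is required here -->
      <xsl:for-each select="exsl:node-set($someRTF)/d:Items/d:Item">
        <NewCode><xsl:value-of select="d:Code"/></NewCode>
      </xsl:for-each>
    </ItemsTransformed>
  </xsl:template>
</xsl:stylesheet>
```

With this approach every downstream path into the variable needs the prefix, which may or may not suit existing templates.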
--- edit: ---
The problem can be easily solved by making sure that the root element of the variable, and by extension all its descendants, is in no namespace.
Here's a simplified example (will run with any input):
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2"
xmlns:exsl="http://exslt.org/common"
exclude-result-prefixes="exsl">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:variable name="someRTF">
<Items xmlns="">
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
</xsl:variable>
<xsl:template match="/">
<ArrayOfSalesOrder>
<ItemsTransformed>
<xsl:for-each select="exsl:node-set($someRTF)/Items/Item">
<NewItem>
<NewCode>
<xsl:value-of select="Code"/>
</NewCode>
</NewItem>
</xsl:for-each>
</ItemsTransformed>
</ArrayOfSalesOrder>
</xsl:template>
</xsl:stylesheet>
Result:
<?xml version="1.0" encoding="UTF-8"?>
<ArrayOfSalesOrder xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2">
<ItemsTransformed>
<NewItem>
<NewCode>code</NewCode>
</NewItem>
<NewItem>
<NewCode>code2</NewCode>
</NewItem>
</ItemsTransformed>
</ArrayOfSalesOrder>

Replacing xml tags in BASH

I have a large collection of xml documents with a wide array of different tags in them. I need to change all tags of the form <foo> and turn them into tags of the form <field name="foo"> in a way that will also ignore the attributes of a given tag. That is, a tag of the form <foo id="bar"> should also be changed to the tag <field name="foo">.
In order for this transformation to work, I also need to distinguish between <foo> and </foo>, as </foo> must go to </field>.
I have played around with sed in a bash script, but to no avail.
Although sed is not ideal for this task (see comments; further reading: regular, context-free grammar and xml), it can be pressed into service. Try this one-liner:
sed -e 's/<\([^>\/\ ]*\)[^>]*>/<field name=\"\1\">/g' -e 's/<field name=\"\">/<\/field>/g' file
The first expression rewrites every tag as <field name="...">, capturing the tag name up to the first space, slash, or closing bracket. For an end tag the captured name is empty, so it becomes <field name="">, which the second expression then turns into </field>.
This prints the result to standard output. To edit the file in place instead (GNU sed), try
sed -i -e 's/<\([^>\/\ ]*\)[^>]*>/<field name=\"\1\">/g' -e 's/<field name=\"\">/<\/field>/g' file
That turns this:
<html>
<person>
but <person name="bob"> and <person name="tom"> would both become
</person>
into this:
<field name="html">
<field name="person">
but <field name="person"> and <field name="person"> would both become
</field>
Sed is the wrong tool for the job - a simple XSL Transform can do this much more reliably:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="foo">
<field name="foo">
<xsl:apply-templates/>
</field>
</xsl:template>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Note that unlike sed, it can handle short empty elements, newlines within tags (e.g. as produced by some tools), and just about anything that's well-formed XML. Here's my test file:
<?xml version="1.0"?>
<doc>
<section>
<foo>Plain foo, simple content</foo>
</section>
<foo attr="0">Foo with attr, with content
<bar/>
<foo attr="shorttag"/>
</foo>
<foo
attr="1"
>multiline</foo
>
<![CDATA[We mustn't transform <foo> in here!]]>
</doc>
which is transformed by the above (using xsltproc 16970175.xslt 16970175.xml) to:
<?xml version="1.0"?>
<doc>
<section>
<field name="foo">Plain foo, simple content</field>
</section>
<field name="foo">Foo with attr, with content
<bar/>
<field name="foo"/>
</field>
<field name="foo">multiline</field>
We mustn't transform <foo> in here!
</doc>
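The stylesheet above renames only foo elements. If every element should be turned into a field, as the question describes, a generalized sketch could match any element and take the name from the element itself. Note that this also renames the document element, which is an assumption about what is wanted:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Rename every element to <field name="original-name"> -->
  <xsl:template match="*">
    <field name="{name()}">
      <!-- Child nodes only: attributes are not selected, so they are dropped -->
      <xsl:apply-templates/>
    </field>
  </xsl:template>
  <!-- Text is copied by the built-in template rules; comments and
       processing instructions are dropped by them -->
</xsl:stylesheet>
```

It can be run the same way, for example with xsltproc and a stylesheet file of your choosing.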

Problem in sorting xml in XSLT 2.0 any ideas?

Hello, I am trying to sort my XML by the number of occurrences of the element 'answer' with attribute 'id' and get a simple summary.
<person id="1">
<answer id="A"/>
<answer id="B"/>
</person>
<person id="2">
<answer id="A"/>
<answer id="C"/>
</person>
<person id="3">
<answer id="C"/>
</person>
I want a simple text summary as output:
A = 2 time(s)
C = 2 time(s)
B = 1 time(s)
In XSLT 2.0 I tried:
<xsl:for-each select="distinct-values(/person/answer)">
<xsl:sort select="count(/person/answer)" data-type="number"/>
<xsl:value-of select="./@id"/> =
<xsl:value-of select="count(/person/answer[@id=./@id])"/> time(s)
</xsl:for-each>
but it is not working:
in XMLSpy 2008:
"Error in XPath 2.0 expression Not a node item"
in Saxon 9:
XPTY0020: Leading '/' cannot select the root node of the tree containing the context item: the context item is an atomic value
Failed to compile stylesheet. 1 error detected.
I would group and count the items in each group:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:for-each-group select="//person/answer" group-by="@id">
<xsl:sort select="count(current-group())" order="descending"/>
<xsl:value-of select="concat(current-grouping-key(), ' = ', count(current-group()), ' time(s).&#10;')"/>
</xsl:for-each-group>
</xsl:template>
</xsl:stylesheet>
That way, when applied to the input
<root>
<person id="1">
<answer id="A"/>
<answer id="B"/>
</person>
<person id="2">
<answer id="A"/>
<answer id="C"/>
</person>
<person id="3">
<answer id="C"/>
</person>
</root>
I get the result
A = 2 time(s).
C = 2 time(s).
B = 1 time(s).
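The errors quoted in the question arise because distinct-values() returns atomic values (strings), not nodes. Inside the loop the context item is therefore a string, so neither ./@id nor a path starting with / can be evaluated. A sketch that keeps the distinct-values() approach by holding the document in a variable (the name $doc is made up for illustration), assuming the same input as above:

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- Keep a handle on the document: inside the loop the context item is a string -->
    <xsl:variable name="doc" select="."/>
    <xsl:for-each select="distinct-values($doc//answer/@id)">
      <!-- current() is the id string being sorted or processed -->
      <xsl:sort select="count($doc//answer[@id = current()])" order="descending"/>
      <xsl:value-of select="concat(., ' = ', count($doc//answer[@id = current()]), ' time(s)&#10;')"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The order in which distinct-values() returns its items is implementation-dependent, so ids with equal counts may come out in either order. The grouping version avoids counting each id twice.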
