What XPath function works to loop through repeating XML nodes.
This is my Source XML:
<?xml version="1.0" encoding="UTF-8"?>
<Record>
<Type>V</Type>
<Address>
<Qual>A</Qual>
<ID>A1</ID>
</Address>
<Address>
<Qual>A</Qual>
<ID>B2</ID>
</Address>
<Address>
<Qual>C</Qual>
<ID>C2</ID>
</Address>
<Category>
<EL>PO</EL>
</Category>
<Category>
<EL>DP</EL>
</Category>
</Record>
I don't want to process the data if Qualf=A & ID = B2, Category =DP & Type =V
My Xpath does not work due to repeating nodes..
(concat(Xpath./Type,Xpath./Record/Address/Qual,Xpath./Record/Address/ID,Xpath./Record/Category/EL) != "VAB2DP"
so I tried
choose((concat(Xpath./Type,Xpath./Record/Address/Qual,Xpath./Record/Address/ID,Xpath./Record/Category/EL) != "VAB2DP"),'true','false'
It still does not work.
Related
I use elementpath to handle some XPath queries. I have an XML with linear structure which contains a unique id attribute.
<items>
<item id="1">...</item>
<item id="2">...</item>
<item id="3">...</item>
... 500k elements
<item id="500003">...</item>
</items>
I want the parser to find the first occurence without traversing all the nodes. For example, I want to select //items/item[#id = '3'] and stop after iterating over 3 nodes only (not over 500k of nodes). It would be a nice optimization for many cases.
An example using XSLT 3 streaming with a static parameter for the XPath, then using xsl:iterate with xsl:break to produce the "early exit" once the first item sought has been found would be
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all">
<xsl:param name="path" static="yes" as="xs:string" select="'items/item[#id = ''3'']'"/>
<xsl:output method="xml"/>
<xsl:mode on-no-match="shallow-copy" streamable="yes"/>
<xsl:template match="/" name="xsl:initial-template">
<xsl:iterate _select="{$path}">
<xsl:if test="position() = 1">
<xsl:copy-of select="."/>
<xsl:break/>
</xsl:if>
</xsl:iterate>
</xsl:template>
</xsl:stylesheet>
You can run it with SaxonC EE (unfortunately streaming is only supported by EE) and Python with e.g.
import saxonc
with saxonc.PySaxonProcessor(license=True) as proc:
print("Test SaxonC on Python")
print(proc.version)
xslt30proc = proc.new_xslt30_processor()
xslt30proc.set_parameter('path', proc.make_string_value('/items/item[#id = "2"]'))
transformer = xslt30proc.compile_stylesheet(stylesheet_file='iterate-items-early-exit1.xsl')
xdm_result = transformer.apply_templates_returning_value(source_file='items-sample1.xml')
if transformer.exception_occurred:
print(transformer.error_message)
print(xdm_result)
input xml to XSLT 1.0
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Body>
<MESSAGE xmlns="http://MessangerApplication/datamodel">
<LAST_UPDATED_BY>-2</LAST_UPDATED_BY>
<LAST_UPDATE_DATE>2016-07-13T15:02:34.000-07:00</LAST_UPDATE_DATE>
<PREV_MODIFIED_DATE_ACC_CONTACT_IN_CUSTOM_TABLE>2016-07-13T15:02:34.000-07:00</PREV_MODIFIED_DATE_ACC_CONTACT_IN_CUSTOM_TABLE>
<ADDRESSES>
<ADDRESS>
<ADDR_ID>-10197894</ADDR_ID>
<LAST_UPDATED_BY>-2</LAST_UPDATED_BY>
<LAST_UPDATE_DATE>2016-07-13T15:54:19.000-07:00</LAST_UPDATE_DATE>
<PREV_MODIFIED_DATE_ADDR_IN_CUSTOM_TABLE/>
</ADDRESS>
<ADDRESS>
<LAST_UPDATED_BY>5309</LAST_UPDATED_BY>
<LAST_UPDATE_DATE>2016-07-13T15:55:32.000-07:00</LAST_UPDATE_DATE>
<PREV_MODIFIED_DATE_ADDR_IN_CUSTOM_TABLE>2016-07-13T16:53:21.000-07:00</PREV_MODIFIED_DATE_ADDR_IN_CUSTOM_TABLE>
</ADDRESS>
<ADDRESS>
<LAST_UPDATED_BY>5309</LAST_UPDATED_BY>
<LAST_UPDATE_DATE>2016-07-13T15:55:32.000-07:00</LAST_UPDATE_DATE>
<PREV_MODIFIED_DATE_ADDR_IN_CUSTOM_TABLE>2016-07-13T15:06:12.000-07:00</PREV_MODIFIED_DATE_ADDR_IN_CUSTOM_TABLE>
</ADDRESS>
</ADDRESSES>
<PHONES>
<PHONE>
<LAST_UPDATED_BY>-2</LAST_UPDATED_BY>
</PHONE>
</PHONES>
</MESSAGE>
</soapenv:Body>
</soapenv:Envelope>
below is my XSLT
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:ns="http://MessangerApplication/datamodel">
<xsl:template match="//ns:MESSAGE">
<ns:Account_Address>
<!-- START : First choose block -->
<xsl:choose>
<xsl:when
test="(('-2' != ns:LAST_UPDATED_BY or '-2' != ns:PHONES/ns:PHONE/ns:LAST_UPDATED_BY) and
(ns:PREV_MODIFIED_DATE_ACC_CONTACT_IN_CUSTOM_TABLE != ''))">
<xsl:variable name="currentDate" select="ns:LAST_UPDATE_DATE"/>
<xsl:variable name="prevDate" select="ns:PREV_MODIFIED_DATE_ACC_CONTACT_IN_CUSTOM_TABLE"/>
<testingAccount>true</testingAccount>
</xsl:when>
<xsl:otherwise>
<testingAccount>false</testingAccount>
</xsl:otherwise>
</xsl:choose>
<!-- END : First choose block -->
**<!-- START : Second choose block -->**
<xsl:choose>
<xsl:when
test="((ns:ADDRESSES/ns:ADDRESS[ns:LAST_UPDATED_BY] != '-2') and
(ns:ADDRESSES/ns:ADDRESS[ns:PREV_MODIFIED_DATE_ADDR_IN_CUSTOM_TABLE != '']))">
<!--
here i want to read LAST_UPDATE_DATE and PREV_MODIFIED_DATE_ADDR_IN_CUSTOM_TABLE
-->
</xsl:when>
<xsl:otherwise>
<addressTesting>false</addressTesting>
</xsl:otherwise>
</xsl:choose>
**<!-- END : Second choose block -->**
</ns:Account_Address>
</xsl:template>
</xsl:stylesheet>
So in XSLT (Second choose block) i want to read below values and stop traversing further when it finds the value of below two tags
1) LAST_UPDATE_DATE (path : ADDRESSES/LAST_UPDATE_DATE)
2) PREV_MODIFIED_DATE_ADDR_IN_CUSTOM_TABLE (path : ADDRESSES/PREV_MODIFIED_DATE_ADDR_IN_CUSTOM_TABLE)
please suggest how can i implement the logic in Second choose block (mentioned in XSLT), many thanks
I am using XSLT 1.0
I have XML that has a lot of duplicated values. I'd like to select all the rows with a specific section ("sec") and section tag ("sec_tag"), but I can't seem to get the XPath correct.
Here's a small snippet of the XML:
<root>
<record>
<sec>5</sec>
<sec_tag>919</sec_tag>
<nested_tag>
<info>Info</info>
<types>
<type>1</type>
<type>2</type>
<type>3</type>
</types>
</nested_tag>
<flags>00000000</flags>
</record>
<record>
<sec>5</sec>
<sec_tag>930</sec_tag>
<nested_tag>
<info>Info</info>
<types>
<type>1</type>
<type>2</type>
<type>3</type>
</types>
</nested_tag>
<flags>00000000</flags>
</record>
<record>
<sec>7</sec>
<sec_tag>919</sec_tag>
<nested_tag>
<info>Info</info>
<types>
<type>1</type>
<type>2</type>
<type>3</type>
</types>
</nested_tag>
<flags>00000000</flags>
</record>
</root>
I want the node that has <sec>5</sec> and <sec_tag>919</sec_tag>.
I tried something like this:
//sec[text(), "5"] and //sec_tag[text(), "919"]
Obviously that's not the correct syntax there, I just need to find the correct XPath expression.
You can use the following XPath expression to return record elements having child sec equals 5 and sec_tag equals 919 :
//record[sec = 5 and sec_tag = 919]
TL;DR; Why can't I use the element name in the XPATH going against a msxsl:node-set? It always returns nothing, as if the node-set is empty, when debugging shows that it is not empty.
Details: I need to use a node-set in an XSLT 1.0 document because my source XML is missing an important node. Instead of having to rewrite the entire XSLT, I'd like to instead inject a node-set so that my XSLT processing can continue as normal. I would like to use XPATH on the node-set but I am not able to use the actual element names, instead only a * works, but I am not sure why, or how I can access the actual element names in the XPATH.
Here is my XML (example only, the XML document here is the least important, see XSLT):
<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="generic.xslt" ?>
<ParentNode xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" i:noNamespaceSchemaLocation="generic.xsd">
<SomeChildNode>text</SomeChildNode>
</ParentNode>
Here is my XSLT:
<?xml version="1.0" encoding="utf-16"?>
<xsl:stylesheet version="1.0" xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2" xmlns:msxsl="urn:schemas-microsoft-com:xslt" xmlns:a="http://schemas.datacontract.org/2004/07/MeM.BizEntities" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<xsl:output method="xml" indent="yes" encoding="utf-16" omit-xml-declaration="no" />
<!-- Global Variables, used in multiple places -->
<xsl:variable name="empty"/>
<!-- Match Templates -->
<xsl:template match="ParentNode">
<ArrayOfSalesOrder>
<xsl:for-each select="SomeChildNode">
<xsl:call-template name="SomeChildNodeTemplate">
<xsl:with-param name="order" select="."/>
</xsl:call-template>
</xsl:for-each>
</ArrayOfSalesOrder>
</xsl:template>
<xsl:template name="SomeChildNodeTemplate">
<xsl:variable name="someRTF">
<Items>
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
</xsl:variable>
<xsl:call-template name="ItemsTemplate">
<xsl:with-param name="items" select="msxsl:node-set($someRTF)"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="ItemsTemplate">
<xsl:param name="items"/>
<ItemsTransformed>
<xsl:for-each select="$items/Item">
<NewItem>
<NewCode>
<xsl:value-of select="Code"/>
</NewCode>
</NewItem>
</xsl:for-each>
</ItemsTransformed>
<ItemsTransformedThatWorksButNotHowIWant>
<xsl:for-each select="$items/*/*">
<NewItem>
<NewCode>
<xsl:value-of select="*[1]"/>
</NewCode>
<NewPrice>
<xsl:value-of select="*[2]"/>
</NewPrice>
<NewQuantity>
<xsl:value-of select="*[3]"/>
</NewQuantity>
</NewItem>
</xsl:for-each>
</ItemsTransformedThatWorksButNotHowIWant>
</xsl:template>
</xsl:stylesheet>
I would expect to be able to use XPATH to query into the node-set such that I can use their proper element names. This doesn't seem to be the case, and I'm struggling to understand why. I know there can be namespacing issues, but trying *:Item etc. doesn't work for me. I am able to use *[local-name()='Item'] but this seems like a horrible work around, not to mention that I'll have to rewrite any downstream templates and that is what I'm trying to avoid by using the node-set in the first place.
Result:
<?xml version="1.0" encoding="utf-16"?>
<ArrayOfSalesOrder xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2" xmlns:msxsl="urn:schemas-microsoft-com:xslt" xmlns:a="http://schemas.datacontract.org/2004/07/MeM.BizEntities" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<ItemsTransformed />
<ItemsTransformedThatWorksButNotHowIWant>
<NewItem>
<NewCode>code</NewCode>
<NewPrice>75</NewPrice>
<NewQuantity>1</NewQuantity>
</NewItem>
<NewItem>
<NewCode>code2</NewCode>
<NewPrice>100</NewPrice>
<NewQuantity>3</NewQuantity>
</NewItem>
</ItemsTransformedThatWorksButNotHowIWant>
</ArrayOfSalesOrder>
As you can see, I can get it to work with * but this is not very usable on a more complex structure. What am I doing wrong? Does this have to do with namespaces?
I would expect to see something under the <ItemsTransformed /> node, but instead it is just empty, and so far I can't get anything except the * to work.
The SO question below is what I was using, I thought I had an answer there, but I can't get the XPATH to work.
Reference:
XSLT 1.0 - Create node set and pass as a parameter
The problem here is that your stylesheet has a default namespace:
xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2"
Therefore, when you do:
<xsl:variable name="someRTF">
<Items>
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
</xsl:variable>
you are populating your variable with elements in the default namespace, so the variable actually contains:
<Items xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2">
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
Naturally, when you try later to select something like:
<xsl:for-each select="xyz:node-set($someRTF)/Items/Item">
you select nothing, because both Items and Item are in the default namespace and you're not calling them by their fully qualified name.
--- edit: ---
The problem can be easily solved by making sure that the root element of the variable - and by extension, all its descendants - are in no namespace.
Here's a simplified example (will run with any input):
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2"
xmlns:exsl="http://exslt.org/common"
exclude-result-prefixes="exsl">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:variable name="someRTF">
<Items xmlns="">
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
</xsl:variable>
<xsl:template match="/">
<ArrayOfSalesOrder>
<ItemsTransformed>
<xsl:for-each select="exsl:node-set($someRTF)/Items/Item">
<NewItem>
<NewCode>
<xsl:value-of select="Code"/>
</NewCode>
</NewItem>
</xsl:for-each>
</ItemsTransformed>
</ArrayOfSalesOrder>
</xsl:template>
</xsl:stylesheet>
Result:
<?xml version="1.0" encoding="UTF-8"?>
<ArrayOfSalesOrder xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2">
<ItemsTransformed>
<NewItem>
<NewCode>code</NewCode>
</NewItem>
<NewItem>
<NewCode>code2</NewCode>
</NewItem>
</ItemsTransformed>
</ArrayOfSalesOrder>
I have a large collection of xml documents with a wide array of different tags in them. I need to change all tags of the form <foo> and turn them into tags of the form <field name="foo"> in a way that will also ignore the attributes of a given tag. That is, a tag of the form <foo id="bar"> should also be changed to the tag <field name="foo">.
In order for this transformation to work, I also need to distinguish between <foo> and </foo>, as </foo> must go to </field>.
I have played around with sed in a bash script, but to no avail.
Although sed is not ideal for this task (see comments; further reading: regular, context-free grammar and xml), it can be pressed into service. Try this one-liner:
sed -e 's/<\([^>\/\ ]*\)[^>]*>/<field name=\"\1\">/g' -e 's/<field name=\"\">/<\/field>/g' file
First it will replace all end tags with </field>, then replace every open tag first words with <field name="firstStoredWord">
This solution prints everything on the standard output. If you want to replace it in file directly when processing, try
sed -i -e 's/<\([^>\/\ ]*\)[^>]*>/<field name=\"\1\">/g' -e 's/<field name=\"\">/<\/field>/g' file
That makes from
<html>
<person>
but <person name="bob"> and <person name="tom"> would both become
</person>
this
<field name="html">
<field name="person">
but <field name="person"> and <field name="person"> would both become
</field>
Sed is the wrong tool for the job - a simple XSL Transform can do this much more reliably:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="foo">
<field name="foo">
<xsl:apply-templates/>
</field>
</xsl:template>
<xsl:template match="#* | node()">
<xsl:copy>
<xsl:apply-templates select="#* | node()" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Note that unlike sed, it can handle short empty elements, newlines within tags (e.g. as produced by some tools), and just about anything that's well-formed XML. Here's my test file:
<?xml version="1.0"?>
<doc>
<section>
<foo>Plain foo, simple content</foo>
</section>
<foo attr="0">Foo with attr, with content
<bar/>
<foo attr="shorttag"/>
</foo>
<foo
attr="1"
>multiline</foo
>
<![CDATA[We mustn't transform <foo> in here!]]>
</doc>
which is transformed by the above (using xsltproc 16970175.xslt 16970175.xml) to:
<?xml version="1.0"?>
<doc>
<section>
<field name="foo">Plain foo, simple content</field>
</section>
<field name="foo">Foo with attr, with content
<bar/>
<field name="foo"/>
</field>
<field name="foo">multiline</field>
We mustn't transform <foo> in here!
</doc>