How can I select the text in the parent node after the current tag in XSL?

I have the following XML:
<titleGroup>
<title type="main" xml:lang="en">Synthesis of <i>N</i>‐Heterocyclic Carbenes and Their Complexes by
Chloronium Ion Abstraction from 2‐Chloroazolium Salts Using Electron‐Rich Phosphines
</title>
</titleGroup>
If I'm in the template that matches "i", how can I get the value "‐Heterocyclic Carbenes and Their Complexes by
Chloronium Ion Abstraction from 2‐Chloroazolium Salts Using Electron‐Rich Phosphines" in XSL? I want only the part that comes after the current node, not the part before it.

From the context of i, the instruction:
<xsl:value-of select="following-sibling::text()[1]"/>
will return the value of the closest following sibling text node.
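For readers working outside XSLT, the same idea can be sketched in Python. This is only an illustration under assumptions: it uses the standard-library xml.etree.ElementTree module, which does not support the following-sibling axis, so it reads the element's tail attribute instead. The title is shortened and uses an ASCII hyphen.

```python
import xml.etree.ElementTree as ET

# Shortened version of the title from the question (ASCII hyphen for brevity).
xml = (
    "<titleGroup>"
    '<title type="main">Synthesis of <i>N</i>-Heterocyclic Carbenes</title>'
    "</titleGroup>"
)

root = ET.fromstring(xml)
i_elem = root.find("./title/i")

# ElementTree stores the text that follows an element's end tag in `.tail`,
# which here plays the role of following-sibling::text()[1].
text_after_i = i_elem.tail
print(text_after_i)
```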

Related

Is it possible to select the properties of a node with XPath?

I have an XML of the form:
<articleslist>
<articles>
<originalId>507948</originalId>
<title>Hogan Lovells Training Contract</title>
<slug>hogan-lovells-training-contract</slug>
<metaTitle>Hogan Lovells Training Contract</metaTitle>
<metaDescription>Find out about the Hogan Lovells Training Contract and Application Process</metaDescription>
<language>en</language>
<disableAds>false</disableAds>
<shortUrl>false</shortUrl>
<category_slug>law</category_slug>
<subcategory_slug>industry</subcategory_slug>
<updatedAt>2021-03-15T18:38:51.058+00:00</updatedAt>
<createdAt>2018-11-29T06:42:51.665+00:00</createdAt>
</articles>
</articleslist>
I'm able to select the row values with the XPATH //articles.
How can I select the child properties of articles (i.e. the column headings), so I get back a list of the form:
originalId
title
slug
etc...
Depends on your XPath version.
In XPath 2.0 it's simply //articles/*/name()
In 1.0 it's not possible because there's no such data type as a "sequence of strings". You would have to return the set of elements as //articles/*, and then extract their names in the calling program.
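As a rough sketch of "extract their names in the calling program", here is one way it might look in Python with the standard-library xml.etree.ElementTree module. The choice of Python and the shortened sample document are assumptions made for illustration; the question does not say which host language is in use.

```python
import xml.etree.ElementTree as ET

# Shortened version of the document from the question.
xml = """<articleslist>
  <articles>
    <originalId>507948</originalId>
    <title>Hogan Lovells Training Contract</title>
    <slug>hogan-lovells-training-contract</slug>
  </articles>
</articleslist>"""

root = ET.fromstring(xml)

# Counterpart of //articles/* : every child element of every <articles>.
children = root.findall(".//articles/*")

# The names are read in the host language, not in XPath.
names = [child.tag for child in children]
print(names)
```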

How to get parent element with attribute using xpath

I have posted sample XML and the expected output below. Kindly help me get this result.
Sample XML
<root>
<A id="1">
<B id="2"/>
<C id="2"/>
</A>
</root>
Expected output:
<A id="1"/>
You can formulate this query in several ways:
Find elements that have a matching attribute, using only descending steps:
//*[@id=1]
Find the attribute, then ascend a step:
//@id[.=1]/..
Use the fn:id($id) function, provided the document is validated and the ID attribute is declared as such:
/id('1')
I don't think what you're after is possible with XPath alone. There's no way to select a node without its children using XPath (meaning that it would always return the nodes B and C along with A in your case).
You could achieve this using XQuery. I'm not sure if this is what you want, but here's an example that creates a new node based on an existing node stored in the $doc variable.
declare variable $doc := <root><A id="1"><B id="2"/><C id="2"/></A></root>;
element {fn:node-name($doc/*)} {$doc/*/@*}
The above returns <A id="1"></A>.
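The same "build a new element from the old one" idea can be sketched in a host language. The following is a minimal illustration in Python, assuming the standard-library xml.etree.ElementTree module; the variable names are hypothetical and not part of the original answer.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<root><A id="1"><B id="2"/><C id="2"/></A></root>')

# Locate the element whose id attribute is "1".
original = root.find(".//*[@id='1']")

# Build a new element with the same name and attributes but no children.
shallow = ET.Element(original.tag, dict(original.attrib))
print(ET.tostring(shallow, encoding="unicode"))
```

The original tree is left untouched; only the copy lacks the B and C children.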
Is this what you are looking for?
//*[@id='1']/parent::* , which is equivalent to //*[@id='1']/..
If you want to verify that the parent is root:
//*[@id='1']/parent::root
https://en.wikipedia.org/wiki/XPath
If you need not just the parent but a higher ancestor with some attribute, read about axis specifiers and use the "ancestor::" axis.

Self axis in xslt

<element>
<bye>do not delete me</bye>
<hello>do not delete me</hello>
<hello>delete me</hello>
<hello>delete me</hello>
</element>
Applied to the above xml, this deletes all the nodes except the first hello child of /element:
<xsl:template match="hello[not(current() = parent::element/hello[1])]" />
Why don't these work? (Assuming the first node is not a text node.)
<xsl:template match="hello[not(self::hello/position() = 1)]" />
<xsl:template match="hello[not(./position() = 1)]" />
Or this one?
<xsl:template match="hello[not(self::hello[1])]" />
What is the self axis selecting? Why isn't this last example equivalent to not(hello[1])?
First, you are wrong when you say that:
This deletes all the nodes except the first hello child of /element
The truth is that it deletes (if that's the correct word) any hello child of /element whose value is not the same as the value of the first one of these. For example, given:
XML
<element>
<hello>a</hello>
<hello>b</hello>
<hello>c</hello>
<hello>a</hello>
</element>
the template:
<xsl:template match="hello[not(current() = parent::element/hello[1])]" />
will match the second and the third hello nodes - but not the first or the fourth.
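To make the value comparison concrete, here is a small Python sketch that mimics what the predicate tests. It is not an XSLT engine, just an illustration using the standard-library xml.etree.ElementTree module on the sample above.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<element><hello>a</hello><hello>b</hello>"
    "<hello>c</hello><hello>a</hello></element>"
)
hellos = root.findall("hello")
first_value = hellos[0].text

# Value comparison, like not(current() = parent::element/hello[1]):
# 1-based positions of the <hello> elements whose text differs from the first.
matched = [pos for pos, h in enumerate(hellos, start=1) if h.text != first_value]
print(matched)
```

The fourth element is not in the list because its string value equals that of the first, even though it is a different node.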
Now, with regard to your question: in XSLT 1.0, position() is not a valid location step - so this:
<xsl:template match="hello[not(self::hello/position() = 1)]" />
should return an error.
In XSLT 2.0, the pattern hello[not(self::hello/position() = 1)] will not match any hello element - because there is only one node on the self axis, and therefore its position is always 1.
Similarly:
<xsl:template match="hello[not(./position() = 1)]" />
is invalid in XSLT 1.0.
In XSLT 2.0, ./position() will always return 1 for the same reason as before: . is short for self::node() and there is only one such node.
Finally, this template:
<xsl:template match="hello[not(self::hello[1])]" />
is looking for a node that doesn't have (the first instance of) itself. Of course, no such node can exist.
Using position() on the RHS of the "/" operator is never useful -- and in XSLT 1.0, which is the tag on your question, it's not actually permitted.
In XSLT 2.0, the result of the expression X/position() is a sequence of integers 1..count(X). If the LHS is a singleton, like self::E, then count(X) is one so the result is a single integer 1.

Sorting XPath results in the same order as multiple select parameters

I have an XML document as follows:
<objects>
<object uid="0" />
<object uid="1" />
<object uid="2" />
</objects>
I can select multiple elements using the following query:
doc.xpath("//object[@uid=2 or @uid=0 or @uid=1]")
But this returns the elements in the same order they're declared in the XML document (uid=0, uid=1, uid=2) and I want the results in the same order as I perform the XPath query (uid=2, uid=0, uid=1).
I'm unsure if this is possible with XPath alone, and have looked into XSLT sorting, but I haven't found an example that explains how I could achieve this.
I'm working in Ruby with the Nokogiri library.
There is no way in XPath 1.0 to specify the order of the selected nodes.
XPath 2.0 allows a sequence of nodes with any specific order:
//object[@uid=2], //object[@uid=1]
evaluates to a sequence in which all object items with @uid=2 precede all object items with @uid=1.
If one doesn't have an XPath 2.0 engine available, it is still possible to use XSLT to output nodes in any desired order.
In this specific case the sequence of the following XSLT instructions:
<xsl:copy-of select="//object[@uid=2]"/>
<xsl:copy-of select="//object[@uid=1]"/>
produces the desired output:
<object uid="2" /><object uid="1" />
I am assuming you are using XPath 1.0. The W3C spec says:
The primary syntactic construct in XPath is the expression. An expression matches the production Expr. An expression is evaluated to yield an object, which has one of the following four basic types:
* node-set (an unordered collection of nodes without duplicates)
* boolean (true or false)
* number (a floating-point number)
* string (a sequence of UCS characters)
So I don't think you can re-order simply using XPath. (The rest of the spec defines document order and reverse document order, so if the latter does what you want, you can get it by using the appropriate axis, e.g. preceding.)
In XSLT you can use <xsl:sort> using the name() of the attribute. The XSLT FAQ is very good and you should find an answer there.
An XSLT example:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:param name="pSequence" select="'2 1'"/>
<xsl:template match="objects">
<xsl:for-each select="object[contains(concat(' ',$pSequence,' '),
concat(' ',@uid,' '))]">
<xsl:sort select="substring-before(concat(' ',$pSequence,' '),
concat(' ',@uid,' '))"/>
<xsl:copy-of select="."/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Output:
<object uid="2" /><object uid="1" />
I don't think there is a way to do it in XPath, but if you are willing to switch to XSLT you can use the xsl:sort element:
<xsl:for-each select="//object[@uid=1 or @uid=2]">
<xsl:sort select="@uid" data-type="number" />
{insert new logic here}
</xsl:for-each>
more complete info here:
http://www.w3schools.com/xsl/el_sort.asp
This is how I'd do it in Nokogiri:
require 'nokogiri'
xml = '<objects><object uid="0" /><object uid="1" /><object uid="2" /></objects>'
doc = Nokogiri::XML(xml)
objects_by_uid = doc.search('//object[@uid="2" or @uid="1"]').sort_by { |n| n['uid'].to_i }.reverse
puts objects_by_uid
Running that outputs:
<object uid="2"/>
<object uid="1"/>
An alternative to the search would be:
objects_by_uid = doc.search('//object[@uid="2" or @uid="1"]').sort { |a,b| b['uid'].to_i <=> a['uid'].to_i }
if you don't like using sort_by with the reverse.
XPath is useful for locating and retrieving the nodes, but often the filtering we want to do gets too convoluted in the accessor, so I let the language do it, whether it's Ruby, Perl or Python. Where I put the filtering logic depends on how big the XML data set is and whether there are a lot of different uid values I'll want to grab. Sometimes letting the XPath engine do the heavy lifting makes sense; other times it's easier to let XPath grab all the object nodes and filter in the calling language.
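A descending sort only happens to match an order such as 2, 1. For an arbitrary requested order like the 2, 0, 1 in the question, one option is to index the nodes by uid in the calling language and read them back in that order. Here is a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree module rather than Nokogiri; the names wanted, by_uid and ordered are hypothetical.

```python
import xml.etree.ElementTree as ET

xml = '<objects><object uid="0" /><object uid="1" /><object uid="2" /></objects>'
root = ET.fromstring(xml)

wanted = ["2", "0", "1"]  # the order requested in the question

# Index the elements by uid, then read them back in the requested order.
# Assumes uid values are unique; uids missing from the document are skipped.
by_uid = {obj.get("uid"): obj for obj in root.findall("object")}
ordered = [by_uid[uid] for uid in wanted if uid in by_uid]
print([obj.get("uid") for obj in ordered])
```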

XPath 1 query and attributes name

First question: is there any way to get the name of a node's attributes?
<node attribute1="value1" attribute2="value2" />
Second question: is there a way to get attributes and values as value pairs? The situation is the following:
<node attribute1="10" attribute2="0" />
I want to get all attributes where value>0 and this way: "attribute1=10".
First question: is there any way to
get the name of a node's attributes?
<node attribute1="value1"
attribute2="value2" />
Yes:
This XPath expression (when node is the context (current) node):
name(@*[1])
produces the name of the first attribute (the ordering may be implementation-dependent),
and this XPath expression (when node is the context (current) node):
name(@*[2])
produces the name of the second attribute (the ordering may be implementation-dependent).
Second question: is there a way to get
attributes and values as value pairs?
The situation is the following:
<node attribute1="10" attribute2="0"
/>
I want to get all attributes where
value>0 and this way: "attribute1=10".
This XPath expression (when the attribute named "attribute1" is the context (current) node):
concat(name(), '=', .)
produces the string:
attribute1=value1
and this XPath expression (when the node element is the context (current) node):
@*[. > 0]
selects all attributes of the context node, whose value is a number, greater than 0.
In XPath 2.0 one can combine them in a single XPath expression:
@*[number(.) > 0]/concat(name(.),'=',.)
to get (in this particular case) this result:
attribute1=10
If you are using XPath 1.0, which is less powerful, you'll need to embed the XPath expression in a hosting language, such as XSLT. The following XSLT 1.0 transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/*">
<xsl:for-each select="@*[number(.) > 0]">
<xsl:value-of select="concat(name(.),'=',.)"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
when applied on this XML document:
<node attribute1="10" attribute2="0" />
Produces exactly the same result:
attribute1=10
It depends a little bit on the context, I believe. In most cases, I expect you'd have to query "@*", enumerate over the items, and call "name()" on each, but it may work in some tests.
Re the edit: you can do:
@*[number(.)>0]
to find attributes matching your criteria, and:
concat(name(),'=',.)
to display the output. I don't think you can do both at once, though. What is the context here? xslt? what?
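If the host is a general-purpose language rather than XSLT, both steps can be combined there. The following is a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree module; the helper is_positive_number is hypothetical and simply mirrors the number(.) > 0 test.

```python
import xml.etree.ElementTree as ET

node = ET.fromstring('<node attribute1="10" attribute2="0" />')


def is_positive_number(value):
    """Return True when value parses as a number greater than zero."""
    try:
        return float(value) > 0
    except ValueError:
        # Non-numeric attribute values are skipped, like NaN in XPath.
        return False


# Keep attributes whose value is > 0 and format them as name=value.
pairs = [
    f"{name}={value}"
    for name, value in node.attrib.items()
    if is_positive_number(value)
]
print(pairs)
```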

Resources