XPath select parent attribute and child element data

XPath newbie here.
Is there a way to select a list of attr1 and child3 pairs using XPath from the sample XML below?
In other words I need a list of a11, c13, a21, c23, etc.
Or can I only select //parent and pick the required values from the resulting array of nodes?
<parentList>
<parent attr1="a11" attr2="a12">
<child1>c11</child1>
<child2>c12</child2>
<child3>c13</child3>
<child4>c14</child4>
<child5>c15</child5>
</parent>
<parent attr1="a21" attr2="22">
<child1>c21</child1>
<child2>c22</child2>
<child3>c23</child3>
<child4>c24</child4>
<child5>c25</child5>
</parent>
</parentList>

This XPath 2.0 expression (after properly closing your sample XML)
//parent/concat(@attr1,' ',child3/text())
should output:
a11 c13
a21 c23

XPath 1.0 solution to select the elements of interest:
//parent/descendant-or-self::*[position()=1 or position()=4]
But to generate a list of the data, you should use the following one:
(//@attr1|//text()[normalize-space()])[parent::parent or parent::*[parent::parent][count(preceding-sibling::*)=2]]
It looks for an attribute or a non-blank text node whose parent is either a parent element, or a child of a parent element that has exactly two preceding siblings (that is, child3).
Output: a11,c13,a21,c23
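If only a limited XPath 1.0 engine is available, the pairing can also be done in the calling program, as the question suggests. A minimal Python sketch, assuming the standard-library xml.etree.ElementTree module (which implements only a subset of XPath 1.0); the sample is trimmed and the variable names are illustrative:

```python
import xml.etree.ElementTree as ET

# Sample document from the question, trimmed to the relevant children.
XML = """<parentList>
  <parent attr1="a11" attr2="a12">
    <child1>c11</child1>
    <child3>c13</child3>
  </parent>
  <parent attr1="a21" attr2="22">
    <child1>c21</child1>
    <child3>c23</child3>
  </parent>
</parentList>"""

root = ET.fromstring(XML)

# Select every <parent>, then read the attribute and the child text per node.
pairs = [(p.get("attr1"), p.findtext("child3")) for p in root.iter("parent")]

for attr1, child3 in pairs:
    print(attr1, child3)
```

Each tuple in pairs holds the attr1 value and the child3 text of one parent element, in document order.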

Related

Is it possible to select the properties of a node with XPath?

I have an XML of the form:
<articleslist>
<articles>
<originalId>507948</originalId>
<title>Hogan Lovells Training Contract</title>
<slug>hogan-lovells-training-contract</slug>
<metaTitle>Hogan Lovells Training Contract</metaTitle>
<metaDescription>Find out about the Hogan Lovells Training Contract and Application Process</metaDescription>
<language>en</language>
<disableAds>false</disableAds>
<shortUrl>false</shortUrl>
<category_slug>law</category_slug>
<subcategory_slug>industry</subcategory_slug>
<updatedAt>2021-03-15T18:38:51.058+00:00</updatedAt>
<createdAt>2018-11-29T06:42:51.665+00:00</createdAt>
</articles>
</articleslist>
I'm able to select the row values with the XPATH //articles.
How can I select the child properties of articles (i.e. the column headings), so I get back a list of the form:
originalId
title
slug
etc...
Depends on your XPath version.
In XPath 2.0 it's simply //articles/*/name()
In 1.0 it's not possible because there's no such data type as a "sequence of strings". You would have to return the set of elements as //articles/*, and then extract their names in the calling program.
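The "extract their names in the calling program" route could look roughly like this in Python. This is only a sketch, assuming xml.etree.ElementTree and a shortened copy of the sample XML:

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document.
XML = """<articleslist>
  <articles>
    <originalId>507948</originalId>
    <title>Hogan Lovells Training Contract</title>
    <slug>hogan-lovells-training-contract</slug>
  </articles>
</articleslist>"""

root = ET.fromstring(XML)

# Select the child elements (the XPath 1.0 part), then read each
# element's name in the host language.
names = [child.tag for child in root.findall(".//articles/*")]
print(names)
```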

How to get parent element with attribute using xpath

I have posted a sample XML document and the expected output below. Kindly help me get this result.
Sample XML
<root>
<A id="1">
<B id="2"/>
<C id="2"/>
</A>
</root>
Expected output:
<A id="1"/>
You can formulate this query in several ways:
Find elements that have a matching attribute, descending only:
//*[@id=1]
Find the attribute, then ascend a step:
//@id[.=1]/..
Use the fn:id($id) function, given the document is validated and the ID-attribute is defined as such:
/id('1')
I don't think what you're after is possible. There's no way to select a node without its children using XPath (meaning it would always return A together with its child nodes B and C in your case).
You could achieve this using XQuery. I'm not sure if this is what you want, but here's an example that creates a new node based on an existing node stored in the $doc variable.
declare variable $doc := <root><A id="1"><B id="2"/><C id="2"/></A></root>;
element {fn:node-name($doc/*)} {$doc/*/@*}
The above returns <A id="1"></A>.
Is this what you are looking for?
//*[@id='1']/parent::* , which is equivalent to //*[@id='1']/..
If you want to verify that the parent is root:
//*[@id='1']/parent::root
https://en.wikipedia.org/wiki/XPath
If you need not just the parent but an element further up the tree with some attribute, read about axis specifiers and use the ancestor:: axis.
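Since XPath can only select the existing A node (children included), a copy without children has to be built by the caller. A minimal Python sketch, assuming xml.etree.ElementTree; the names match and shallow are illustrative:

```python
import xml.etree.ElementTree as ET

XML = '<root><A id="1"><B id="2"/><C id="2"/></A></root>'
root = ET.fromstring(XML)

# Find the first element below the root whose id attribute is "1".
match = root.find(".//*[@id='1']")

# Build a new element with the same name and attributes but no children.
shallow = ET.Element(match.tag, dict(match.attrib))

print(ET.tostring(shallow, encoding="unicode"))
```

The original tree is left untouched; match still has its B and C children.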

Use xpath to locate a complex element with attributes and children

Given this XML
<well bulkShift="0.000000" diameter="5.000000" hidden="false" name="67-1-TpX-10" filename="67-1-TpX-10.well">
<metadata/>
<unit>ftUS</unit>
<colour blue="1.000000" green="1.000000" hue="" red="1.000000"/>
<tvd clip="false"/>
<associatedcheckshot>25-1-X-14</associatedcheckshot>
<associatedwelllog>HDRA_67-1-TpX-10</associatedwelllog>
<associatedwelllog>NPHI_67-1-TpX-10</associatedwelllog>
</well>
I can select the element with this XPath
//well[@bulkShift=0 and @diameter=5 and @hidden='false' and @name='67-1-TpX-10' and @filename='67-1-TpX-10.well']
However I need to be much more specific in that I need to find the element with these specific child nodes given that the child elements (metadata,unit,colour, etc) can appear in any order inside the element.
Ideally I'd like to be able to select this node with only a single XPath query.
Can anyone help?
This template also matches on child elements and on attributes of child elements:
<xsl:template match="well[@hidden='false'][./unit='ftUS' or ./tvd/@clip='false']">
well found!
</xsl:template>
or in one go:
<xsl:template match="well[@hidden='false' and (./unit='ftUS' or ./tvd/@clip='false')]">
well found!
</xsl:template>
You can add tests for child elements to your predicate in the same way as the tests for attributes,
e.g.:
//well[@bulkShift=0 and @diameter=5 and @hidden='false' and @name='67-1-TpX-10' and @filename='67-1-TpX-10.well']
[metadata and unit and colour]
A list of predicates [ predicate1 ][ predicate2 ] is equivalent to a single predicate that combines them with and, as long as none of them is positional.
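The same idea can be tried out in Python. This is a sketch, assuming xml.etree.ElementTree, which accepts chained predicates but compares attribute values as strings only, so the numeric tests are done in Python. The wells wrapper and the second well element are hypothetical additions for the demonstration:

```python
import xml.etree.ElementTree as ET

# First <well> is from the question, with its children in a different order.
# The <wells> wrapper and the second <well> (same attributes, but without
# the metadata and colour children) are hypothetical.
XML = """<wells>
  <well bulkShift="0.000000" diameter="5.000000" hidden="false" name="67-1-TpX-10" filename="67-1-TpX-10.well">
    <unit>ftUS</unit>
    <metadata/>
    <tvd clip="false"/>
    <colour blue="1.000000" green="1.000000" hue="" red="1.000000"/>
  </well>
  <well bulkShift="0.000000" diameter="5.000000" hidden="false" name="67-1-TpX-10" filename="67-1-TpX-10.well">
    <unit>ftUS</unit>
  </well>
</wells>"""

root = ET.fromstring(XML)

# String-valued attribute tests and child-existence tests, chained as predicates.
candidates = root.findall(
    "well[@hidden='false'][@name='67-1-TpX-10'][metadata][unit][colour]"
)

# Numeric attribute tests (@bulkShift=0, @diameter=5) are applied in Python.
matches = [
    w for w in candidates
    if float(w.get("bulkShift")) == 0 and float(w.get("diameter")) == 5
]

print(len(matches))
```

Because the child tests only check for existence, the order of metadata, unit and colour inside well does not matter.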

XPATH -- Result order defined by query

I have an xpath-expression like this:
element[@attr="a"] | element[@attr="b"] | element[@attr="c"] | … which is an »or« statement. Can I create an expression that guarantees the results appear in the order given in the query, even if the elements appear in a different order in the document?
For example, a document fragment in this order:
<doc>
<element attr="c" />
<element attr="b" />
<element attr="a" />
.
.
.
</doc>
and a result list ordered like this:
[0] <element attr="a" />
[1] <element attr="b" />
[2] <element attr="c" />
.
.
.
The | operator computes the union of its operands. With XPath 1.0 you simply get a set of nodes whose order is undefined, though most XPath APIs then return the result in document order, or let you say which order you want or whether order matters (see for instance http://www.w3.org/TR/DOM-Level-3-XPath/xpath.html#XPathResult).
With XPath 2.0 the union yields a sequence of nodes in document order. If you want the order of your subexpressions instead, you need to use the comma operator rather than the union operator, i.e. element[@attr="a"] , element[@attr="b"] , element[@attr="c"].
can I create an expression that guarantees the result to appear in the
order as in the query, even if the elements appear in a different
order in the document?
Not with any XPath 1.0 engine -- they return the resulting XmlNodeList in document order.
With XPath 2.0 one can specify that a sequence is to be returned, using the comma , operator, like this:
element[@attr="a"] , element[@attr="b"] , element[@attr="c"]
Finally, if you are limited to an XPath 1.0 implementation, one way of getting the results in the desired order is to evaluate these three XPath expressions separately:
element[@attr="a"]
element[@attr="b"]
element[@attr="c"]
Then process the result of the first expression first, that of the second next, and that of the third last.
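Evaluating one expression per value and concatenating the results could be sketched like this in Python, assuming xml.etree.ElementTree; the list name wanted is illustrative:

```python
import xml.etree.ElementTree as ET

XML = """<doc>
  <element attr="c"/>
  <element attr="b"/>
  <element attr="a"/>
</doc>"""

doc = ET.fromstring(XML)

# Desired result order, independent of document order.
wanted = ["a", "b", "c"]

# Evaluate one expression per value and append the results in query order.
ordered = []
for value in wanted:
    ordered.extend(doc.findall(f"element[@attr='{value}']"))

print([e.get("attr") for e in ordered])
```

Within each individual expression the matches still come back in document order; only the order between the expressions is controlled by the loop.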

How to select node which has a parent with some attributes

How do I select a node whose parent has some attributes?
E.g., what is the XPath to select all expiration_time elements?
In the following XML, I get an error if the states elements have attributes; otherwise there is no problem.
Thanks
<lifecycle>
<states elem="0">
<expiration_time at="rib" zing="chack">08</expiration_time>
</states>
<states elem="1">
<expiration_time at="but">4:52</expiration_time>
</states>
<states elem="2">
<expiration_time at="ute">05:40:15</expiration_time>
</states>
<states elem="3">
<expiration_time>00:00:00</expiration_time>
</states>
</lifecycle>
states/expiration_time[../@elem = "0"]?
Use:
/*/*/expiration_time
This selects all expiration_time elements that are grand-children of the top-element of the XML document.
/*/*[@*]/expiration_time
This selects any expiration_time element whose parent has at least one attribute and is a child of the top element of the XML document.
/*/*[not(@*)]/expiration_time
This selects any expiration_time element whose parent has no attributes and is a child of the top element of the XML document.
/*/*[@elem = '2']/expiration_time
This selects any expiration_time element whose parent is a child of the top element of the XML document and has an elem attribute with the string value '2'.
This will give you all nodes that have at least one attribute:
//*[count(./@*) > 0]
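These selections can be checked quickly in Python. This is a sketch, assuming xml.etree.ElementTree, which has no @* test, so "has any attribute" is checked through the element's attrib dictionary. The sample is shortened, and the last states element was stripped of its attribute for the demonstration:

```python
import xml.etree.ElementTree as ET

# Shortened sample; the attribute-less <states> is a hypothetical variation.
XML = """<lifecycle>
  <states elem="0"><expiration_time at="rib" zing="chack">08</expiration_time></states>
  <states elem="2"><expiration_time at="ute">05:40:15</expiration_time></states>
  <states><expiration_time>00:00:00</expiration_time></states>
</lifecycle>"""

root = ET.fromstring(XML)

# Counterpart of /*/*/expiration_time
all_times = [e.text for e in root.findall("*/expiration_time")]

# Counterpart of /*/*[@elem = '2']/expiration_time
elem2 = [e.text for e in root.findall("*[@elem='2']/expiration_time")]

# Counterpart of /*/*[@*]/expiration_time: parent has at least one attribute.
with_attrs = [
    e.text
    for parent in root
    if parent.attrib
    for e in parent.findall("expiration_time")
]

print(all_times, elem2, with_attrs)
```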
