XPath to select only nodes where child elements exist?

This should be an easy one but it is giving me trouble. Given this structure:
<root>
<a>
<b/>
</a>
<a/>
</root>
I'm trying to formulate an xpath expression that gives only the non-empty "a" elements, i.e. the ones that have child elements. Therefore I want the first instance of "a" returned, but not the second.
So far I have "/root/a/self::*" but that is returning me both a's.

/root/a[count(*)>0]
will give any 'a' node that has at least one child element (* matches element nodes only, not text or comments)

This one works
/root/a[*]
or even
//a[*]
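A minimal sketch of the same selection in Python, assuming only the standard-library xml.etree.ElementTree. That module implements just a limited subset of XPath, so the "has a child element" test is written in plain Python here rather than as a predicate. The id attributes and the third, text-only a element are additions for illustration and are not part of the question's document.

```python
import xml.etree.ElementTree as ET

# Question's document, plus an <a> holding only text (hypothetical addition)
# and id attributes (also hypothetical) to tell the elements apart.
doc = ET.fromstring(
    "<root><a id='1'><b/></a><a id='2'/><a id='3'>text only</a></root>"
)

# len(elem) counts child elements only, which mirrors what [*] tests for:
# text content does not make an element "non-empty" in this sense.
non_empty = [a for a in doc.findall("a") if len(a) > 0]

print([a.get("id") for a in non_empty])  # ['1']
```

With a full XPath engine the single expression /root/a[*] should pick out the same element.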

Related

Using XPATH previous:: more like an array

I've got XML like this
<root>
...
<a>
<a>
<a>
<c>
...
It's very flat, with LOTS of A elements and a few C elements. The A elements are sensor data and the last reading before each C is bogus, so I need the one before it. I'd like to use the C elements as markers and select the A element two positions before each C. So I'm trying out an XPath like:
/root/c/preceding-sibling::a
but I'm getting all previous A elements, I was hoping for something a bit more direct such as:
/root/c/preceeding-sibling[-2]
which would just grab the 2nd sibling before C (no matter the type). I guess I'm asking for array-like functionality in XPath, so that whatever I match, I can ask for "the second element before that".
Is this possible?
You can "just grab the 2nd sibling before C (no matter the type)" with the XPath expression
/root/c/preceding-sibling::*[2]
Positions on the preceding-sibling:: axis are counted backwards from the context node. The node with index [1] is the one immediately before c, and the node with index [2] is the one before that, which is "the second element before that".
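The backwards counting can be sketched in Python with the standard-library xml.etree.ElementTree. Its XPath subset has no preceding-sibling axis, so this emulates the lookup with list indexes instead of evaluating the expression itself. The v attributes and the exact sequence of elements are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical readings: the v attribute only labels each <a> for the demo.
root = ET.fromstring(
    "<root><a v='1'/><a v='2'/><a v='3'/><c/><a v='4'/><a v='5'/><c/></root>"
)

children = list(root)

def second_before(marker_index):
    """Return the sibling two positions before children[marker_index], or None."""
    target = marker_index - 2
    return children[target] if target >= 0 else None

# For every <c>, step back two siblings regardless of element type.
picked = [second_before(i) for i, child in enumerate(children) if child.tag == "c"]
picked = [e for e in picked if e is not None]

print([e.get("v") for e in picked])  # ['2', '4']
```

The reading directly before each c (v='3' and v='5') is skipped, and the one before it is returned.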

using xpath need to get to a child and get another node one level up

I am trying to traverse an XML document with XPath. I want to visit /group/isRequired[text()='Optional'] and then move back up to grab the /bool node.
I tried a few things like the ones below but can't seem to get it right. I'd appreciate any input.
I basically want to verify the Library node, group+isRequired node and the bool nodes in one statement.
//root/sample[library[text()='2']]/group/isRequired[text()='Optional']//bool[text()='true']
//root/sample[library[text()='2']]/group/isRequired[text()='Optional']../bool[text()='true']
//root/sample[library[text()='2']]/group/isRequired[text()='Optional']/bool[text()='true']
//root/sample[library[text()='2']]/group/isRequired[text()='Optional']/../bool[text()='true']
<root>
<sample>
<id>1</id>
<library>2</library>
<ruleName>Default</ruleName>
<group>
<groupID>1</groupID>
<groupName>orange</groupName>
<isRequired>Optional</isRequired>
</group>
<variant>1</variant>
<bool>true</bool>
</sample>
</root>
You need to move two steps up:
/root/sample[library[text()='2']]/group/isRequired[text()='Optional']/../../bool[text()='true']
But it is much cleaner to put multiple conditions in one predicate:
/root/sample[library[text()='2'] and group/isRequired[text()='Optional'] and bool[text()='true']]
Simpler:
/root/sample[library = "2" and group/isRequired = "Optional" and bool = "true"]
You don't have to use text() to get the value of every node in the XPath; comparing the node itself uses its string value. Depending on whether your XML has a schema, you may not need to put the literal values in quotes. Without a schema, everything is a string value, so I put them in quotes just for safety.
You can also go a different route: filter the sample node by its group/isRequired child, then continue from that sample node to get to the bool node:
//root/sample[library='2' and group/isRequired='Optional']/bool[.='true']
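As a rough cross-check of the combined conditions, here is a sketch using Python's standard-library xml.etree.ElementTree. Its XPath subset does not cover the and operator, so the three conditions are combined in Python. The second sample element is a hypothetical addition that should be filtered out.

```python
import xml.etree.ElementTree as ET

xml = """<root>
  <sample>
    <id>1</id><library>2</library><ruleName>Default</ruleName>
    <group><groupID>1</groupID><groupName>orange</groupName><isRequired>Optional</isRequired></group>
    <variant>1</variant><bool>true</bool>
  </sample>
  <sample>
    <id>2</id><library>2</library>
    <group><isRequired>Mandatory</isRequired></group>
    <bool>true</bool>
  </sample>
</root>"""
root = ET.fromstring(xml)

# findtext returns the text of the first matching element (or None),
# so each comparison checks one condition of the predicate.
matches = [
    s for s in root.findall("sample")
    if s.findtext("library") == "2"
    and s.findtext("group/isRequired") == "Optional"
    and s.findtext("bool") == "true"
]

print([s.findtext("id") for s in matches])  # ['1']
```

Because every condition is evaluated from the sample element, there is no need to walk down to isRequired and back up again.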

Compare attribute of one element to attribute of another element

This seems like it should be easy, but I can never figure it out.
Presume I have the following document:
<data>
<a>
<b val="1"/>
</a>
<c val="1"/>
</data>
And assume that I am executing an XPath from the context of <b>. I need to check if there is an element c that has the same value as b.
Obviously, this doesn't work:
../a/c[@val=@val]
How do I get an XPath to remember its "current" context when traversing the tree?
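Try the expression below. The current node is not lost, because the c node is reached inside a predicate instead of by stepping away from b.
.[../../c/@val=@val]
Note that in XPath 1.0 the abbreviated step . cannot take a predicate, so there it has to be written as self::*[../../c/@val=@val].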
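The comparison that the predicate performs can be sketched in Python with the standard-library xml.etree.ElementTree. ElementTree elements have no parent pointer, so this sketch starts from the data element and keeps b in a variable, which plays the role of the "current" context; it emulates the check rather than evaluating the XPath.

```python
import xml.etree.ElementTree as ET

data = ET.fromstring('<data><a><b val="1"/></a><c val="1"/></data>')

b = data.find("a/b")  # the context node from the question
b_val = b.get("val")

# Is there a <c> child of <data> whose val equals b's val?
# The None guard keeps a missing attribute from counting as a match.
has_match = b_val is not None and any(
    c.get("val") == b_val for c in data.findall("c")
)

print(has_match)  # True
```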

XPath: How do I select text() or a span element within a parent element

I have a parent element (font) and I would like to select all the child elements (direct descendants) that are either text() or span elements. How would I construct such an xpath?
If the current node is the font element, then something like this:
text()|span
otherwise you always have to combine two complete XPath expressions with |, one for the text nodes and one for the span elements, e.g.:
font/text()|font/span
if the current node is just above font - or
//a[text()='View Larger Map']/../../../../div[contains(@class, 'paragraph')][3]/font/span|//a[text()='View Larger Map']/../../../../div[contains(@class, 'paragraph')][3]/font/text()
if starting from the root with some complex selection criteria.
If you have complex paths like the last one, it is probably better to store the shared part in a variable, e.g. inside an XSLT:
<xsl:variable name="font" select="//a[text()='View Larger Map']/../../../../div[contains(@class, 'paragraph')][3]/font"/>
. . .
<xsl:for-each select="$font/span|$font/text()">
. . .
</xsl:for-each>
Another possibility is to do something like this:
//a[text()='View Larger Map']/../../../../div[contains(@class, 'paragraph')][3]/font/node()[name()='span' or name()='']
that works because name() returns an empty string for text() nodes - but I am not 100% sure that it works that way for all XPath processors, and it could match by mistake comment nodes.
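To make "direct text and span children" concrete, here is a sketch with Python's standard-library xml.etree.ElementTree. ElementTree does not expose text nodes as separate objects; text lives in each element's .text and .tail, so the sketch walks those by hand instead of evaluating text()|span. The sample font content is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical content: text, a <span>, more text, a <b> to be skipped, text.
font = ET.fromstring("<font>Hello <span>big</span> world<b>skip</b>!</font>")

selected = []
if font.text:
    selected.append(("text", font.text))      # text before the first child
for child in font:
    if child.tag == "span":
        selected.append(("span", child.text))  # direct span child
    if child.tail:
        selected.append(("text", child.tail))  # text following this child

print(selected)
# [('text', 'Hello '), ('span', 'big'), ('text', ' world'), ('text', '!')]
```

The b element is left out, but the text that follows it still belongs to font and is kept, just as text() would select it.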

Find attribute names that start with a certain pattern

I am looking to find all attributes of an element that match a certain pattern.
So for an element
<element s2="1" name="aaaa" id="1" />
<element s3="1" name="aaaa" id="2" />
I would like to be able to find all attributes that start with 's' (returning the value of s2 for the first element and the value of s3 for the second element).
If this is outside of xpath's ability please let me know.
Use:
element/@*[starts-with(name(), 's')]
This XPath expression selects all attribute nodes whose name starts with the string 's' and that are attributes of elements named element that are children of the current node.
starts-with() is a standard function in XPath 1.0
element/@*[substring(name(), 1,1) = "s"]
will match any attribute that starts with 's'.
The function starts-with() might look better than using substring()
I've tested the given answers from both @Dimitre-Novatchev and @Ledhund, using the lxml.html module in Python.
Both element/@*[starts-with(name(), 's')] and element/@*[substring(name(), 1,1) = "s"] return only the values of s2 and s3. You won't be able to know which value belongs to which attribute.
I think in practice I would be more interested in finding the elements themselves that contain the attributes of names starting with specific characters rather than just their values.
To achieve that is very simple, just add /.. at the end,
element/@*[starts-with(name(), "s")]/..
or
element/@*[starts-with(name(), "s")]/parent::*
or
element/@*[starts-with(name(), "s")]/parent::node()
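If both the attribute name and its owning element are needed, one option is to do the prefix match in Python. This is a sketch with the standard-library xml.etree.ElementTree, which cannot select attribute nodes through its XPath subset, so it filters each element's attrib mapping instead. The root wrapper element is an assumption added to make the fragment well-formed.

```python
import xml.etree.ElementTree as ET

# <root> is a hypothetical wrapper around the two elements from the question.
root = ET.fromstring(
    '<root>'
    '<element s2="1" name="aaaa" id="1"/>'
    '<element s3="1" name="aaaa" id="2"/>'
    '</root>'
)

# Keep (owning element id, attribute name, attribute value) together so the
# value can always be traced back to its attribute and element.
hits = [
    (el.get("id"), name, value)
    for el in root.findall("element")
    for name, value in el.attrib.items()
    if name.startswith("s")
]

print(hits)  # [('1', 's2', '1'), ('2', 's3', '1')]
```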
None of the above worked for me.
So I made some changes, and this worked for me. :)
/*:UserCustomField[starts-with(@name, 'purchaseDate')]
