XPath to select all except common element

Say I have a set of XML messages, each with a common header but different types otherwise:
<MessageType details="test">
<MessageHeader>
<HeaderContent>content</HeaderContent>
</MessageHeader>
<OtherStuff detail="test"/>
<MessageContent>
<MessageText>text</MessageText>
</MessageContent>
</MessageType>
<MessageType2 details="test2">
<MessageHeader>
<HeaderContent>content2</HeaderContent>
</MessageHeader>
<MessageContent2>
<MessageText2>text2</MessageText2>
</MessageContent2>
</MessageType2>
<MessageType3 details="test3">
<MessageHeader>
<HeaderContent>content3</HeaderContent>
</MessageHeader>
<OtherStuff3 detail="test3"/>
<MoreStuff3 detail="test3"/>
<MessageContent3>
<MessageText3>text3</MessageText3>
<AnotherElement><Test text="">text3</Test></AnotherElement>
</MessageContent3>
</MessageType3>
I need an XPath expression that selects everything except MessageHeader and the root element (because the root contains MessageHeader).
//*[not(self::MessageHeader)] selects everything except MessageHeader, but it also selects the root element, which I don't want.
I would also settle for something that selects all siblings of MessageHeader, because I think that does basically the same thing in my case.

"I would also settle for something that selects all siblings of MessageHeader"
That would be /*/MessageHeader/following-sibling::*
Since MessageHeader is the first child in all your samples, you could also use tail(/*/*) in XPath 3.0, or (/*/*)[position() != 1] in XPath 1.0.
Note: if you use //*, you're selecting elements deeper in the tree, including for example HeaderContent. I think you're probably better off selecting only level-2 elements (children of the outermost element), because you can always navigate downwards from those if you need to. Unlike //*, /*/* only selects level-2 elements.
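To make the difference between /*/* and //* concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree on the first sample message. ElementTree implements only a subset of XPath (it has no following-sibling axis and no not()), so the MessageHeader filter is written in Python rather than in the path; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# First sample message from the question.
MESSAGE = """<MessageType details="test">
<MessageHeader>
<HeaderContent>content</HeaderContent>
</MessageHeader>
<OtherStuff detail="test"/>
<MessageContent>
<MessageText>text</MessageText>
</MessageContent>
</MessageType>"""

root = ET.fromstring(MESSAGE)

# Counterpart of /*/* : only the children of the outermost element.
level2 = [child.tag for child in root.findall("*")]

# Counterpart of /*/*[not(self::MessageHeader)] : the header's siblings.
siblings = [child.tag for child in root.findall("*")
            if child.tag != "MessageHeader"]

# Counterpart of //* without the root: every descendant, at any depth.
descendants = [elem.tag for elem in root.findall(".//*")]

print(level2)       # ['MessageHeader', 'OtherStuff', 'MessageContent']
print(siblings)     # ['OtherStuff', 'MessageContent']
print(descendants)  # also contains 'HeaderContent' and 'MessageText'
```

The last list shows why //* is usually too broad here: it reaches into the header and the content elements as well.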

Another option for "an XPath that will select everything except MessageHeader and the root element":
//*[not(count(ancestor::*)=0) and not(self::MessageHeader)]

Related

XPath select element if other not exist

In some cases we need to select the input[@class="some"] element, but only if div[@class="other"] does not exist.
The two elements have no common parent, except body, of course.
Since our environment only works with XPath, we need a pure XPath solution.
UPD: if the div element exists, nothing should be returned.
Try (not tested):
input[@class="some"][not(//div[@class="other"])]
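The same "select only if the other element is absent" logic can be sketched in Python with the standard xml.etree.ElementTree module. This is only an illustration of the intent: ElementTree does not support not(), so the existence test is a separate lookup, and the function name and sample documents are made up for the example.

```python
import xml.etree.ElementTree as ET

def select_input(xml_text):
    """Return input[@class='some'] elements only if no div[@class='other'] exists."""
    root = ET.fromstring(xml_text)
    # Counterpart of the predicate [not(//div[@class="other"])]
    if root.find(".//div[@class='other']") is not None:
        return []
    # Counterpart of //input[@class="some"]
    return root.findall(".//input[@class='some']")

# Hypothetical sample documents.
WITHOUT_DIV = "<body><p><input class='some'/></p><div class='x'/></body>"
WITH_DIV = "<body><p><input class='some'/></p><div class='other'/></body>"

print(len(select_input(WITHOUT_DIV)))  # 1
print(len(select_input(WITH_DIV)))     # 0
```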

XPath to go back to sibling td

I am trying to go back to the previous td, but to no avail. Can you help?
//*[@class='ein' and contains(.,'aaaa')] gets me to the td, but I need to select the previous td. I tried the expression below, but it did not work:
//*[@class='ein' and contains(.,'aaaa')][preceding-sibling::td]
Remember /X means "select X", while [X] means "where X". If you want to select preceding siblings, rather than testing whether they exist, use /.
It's impossible to say for certain without seeing the input HTML but I suspect that instead of
//*[@class='ein' and contains(.,'aaaa')][preceding-sibling::td]
you need something like
//*[@class='ein' and contains(.,'aaaa')]/preceding-sibling::td[1]
to navigate from each node selected by the initial expression to its nearest preceding td. Your first attempt will select exactly the same nodes as
//*[@class='ein' and contains(.,'aaaa')]
but only if they have at least one preceding-sibling element named td.
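The difference between filtering with [X] and stepping with /X can be imitated in plain Python. The sketch below uses the standard xml.etree.ElementTree module, which has no preceding-sibling axis, so both variants walk the row's children by hand; the sample row and the helper name are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical table row.
ROW = ET.fromstring(
    "<tr><td class='key'>name</td>"
    "<td class='ein'>aaaa-1</td>"
    "<td class='ein'>bbbb</td></tr>"
)

def matches(td):
    # Counterpart of [@class='ein' and contains(., 'aaaa')]
    return td.get("class") == "ein" and "aaaa" in "".join(td.itertext())

cells = list(ROW)  # child elements of the row, in document order

# "[preceding-sibling::td]" used as a filter: the matching cell itself is kept.
filtered = [td for i, td in enumerate(cells)
            if matches(td) and any(c.tag == "td" for c in cells[:i])]

# "/preceding-sibling::td[1]" used as a step: move to the nearest preceding td.
previous = []
for i, td in enumerate(cells):
    if matches(td):
        before = [c for c in cells[:i] if c.tag == "td"]
        if before:
            previous.append(before[-1])

print([td.text for td in filtered])  # ['aaaa-1']
print([td.text for td in previous])  # ['name']
```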
Use // after the element you found and, instead of preceding-sibling, just use preceding:
//*[@class='ein' and contains(.,'aaaa')]//preceding::td[1]

XPath: How do I select text() or a span element within a parent element

I have a parent element (font) and I would like to select all the child elements (direct descendants) that are either text() or span elements. How would I construct such an xpath?
If the current node is the font element, then something like this:
text()|span
Otherwise you always have to combine the two complete paths with |, one for the text nodes and one for the span elements, e.g.:
font/text()|font/span
if the current node is just above font, or
//a[text()='View Larger Map']/../../../../div[contains(@class, 'paragraph')][3]/font/span|//a[text()='View Larger Map']/../../../../div[contains(@class, 'paragraph')][3]/font/text()
if starting from the root with some complex selection criteria.
If you have complex paths like the last one, it is probably better to store the common part in a variable, e.g. inside an XSLT:
<xsl:variable name="font" select="//a[text()='View Larger Map']/../../../../div[contains(@class, 'paragraph')][3]/font"/>
. . .
<xsl:for-each select="$font/span|$font/text()">
. . .
</xsl:for-each>
Another possibility is to do something like this:
//a[text()='View Larger Map']/../../../../div[contains(@class, 'paragraph')][3]/font/node()[name()='span' or name()='']
This works because name() returns an empty string for text nodes. It returns an empty string for comment nodes as well, so the expression also matches any comments inside font; node()[self::span or self::text()] avoids that.
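For comparison, here is a minimal sketch of the same selection done through the DOM with Python's standard xml.dom.minidom module, which exposes text nodes and comments as separate child nodes. The sample markup is invented for the illustration; it contains a b element and a comment to show that both are skipped.

```python
from xml.dom import minidom, Node

# Hypothetical font element with mixed content.
DOC = minidom.parseString(
    "<font>Hello <span>brave</span><b>new</b> world<!-- note --></font>"
)
font = DOC.documentElement

# Counterpart of font/text()|font/span : keep text nodes and span elements,
# skip other elements and comments.
selected = [
    node for node in font.childNodes
    if node.nodeType == Node.TEXT_NODE
    or (node.nodeType == Node.ELEMENT_NODE and node.tagName == "span")
]

for node in selected:
    if node.nodeType == Node.TEXT_NODE:
        print("text:", repr(node.data))
    else:
        print("element:", node.tagName)
```

Testing the node type explicitly sidesteps the name()='' ambiguity between text nodes and comments.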

How to make one RxPath from two

I have these two RxPaths, which I need to write as one expression:
/td[2]/a[1]/tag[1]
and
/td[2]/a[1]
So basically I need to select the path with the 'tag' element if it exists, and otherwise select the 'a' element.
something like:
if exist /td[2]/a[1]/tag[1] select /td[2]/a[1]/tag[1]
else select /td[2]/a[1]
Those elements need to have an innertext attribute with some value in it, so I tried:
/td[2]/descendant::node()[@innertext!='']
but it doesn't work.
Also, those elements are at the bottom of the hierarchy, so is there any way to just select the first element at the lowest level?
I managed to solve this with a regex at the end of my XPath expression:
/dom/body/div[@id='isc_0']/div/div[@id='isc_B']/div[@id='isc_C']/div[@id='isc_10']/div/div/iframe/body/table/tbody/tr/td[1]/a[@innertext='any uri item']/../../td[2]/*[@innertext~'[^ ]+']
Sorry for the misunderstanding about the problem.
Regards,
Vajda Vladimir
"So basically I need to select path with 'tag' element if exists, if not than to select 'a' element. something like:
if exist /td[2]/a[1]/tag[1] select /td[2]/a[1]/tag[1]
else select /td[2]/a[1]"
I highly doubt that the top element of the document is a td. Don't use /td: it means you want to select the top element of the document, and that top element must be a td.
Also, /td[2] never selects anything, because a well-formed XML document has exactly one top element.
Use:
someParentElement/td[2]/a[1]/tag[1]
|
someParentElement/td[2]/a[1][not(tag[1])]
"Those elements need to have innertext attribute with some value in them"
Use:
someParentElement/td[2][.//@innertext[normalize-space()]]/a[1]/tag[1]
|
someParentElement/td[2][.//@innertext[normalize-space()]]/a[1][not(tag[1])]
"Also those elements are at the bottom of hierarchy so if is there any way to just select first element at lowest level."
This is not clear. Please clarify.
All "leaf" elements can be selected using the following XPath expression:
//*[not(*)]
The elements selected don't have any child elements, but may have other children (such as text nodes, PIs, comments) and attributes.
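As a rough illustration of what "leaf element" means here, the following sketch collects the elements without child elements using Python's standard xml.etree.ElementTree module. ElementTree cannot evaluate not(*), so the test is len(elem) == 0, which counts child elements only; the sample table is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical table; the second td has text but no child elements.
TABLE = ET.fromstring(
    "<table><tr>"
    "<td><a><tag innertext='x'/></a></td>"
    "<td>plain text</td>"
    "</tr></table>"
)

# Counterpart of //*[not(*)] : elements that have no child elements.
leaves = [elem.tag for elem in TABLE.iter() if len(elem) == 0]

print(leaves)  # ['tag', 'td']
```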
Besides all the good advice from @Dimitre, I want to add that a parent always comes before its child in document order, so you could use this XPath expression:
(/real-path-from-root/td[2]/a[1]
|
/real-path-from-root/td[2]/a[1]/tag[1])[last()]
You could do this without the | union operator in XPath 1.0, but the result would be very unreadable. Of course, in XPath 2.0 you could just do:
(/real-path-from-root/td[2]/a[1]/(.|tag[1]))[last()]
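The "parent first, child last" trick can be mirrored in Python with the standard xml.etree.ElementTree module. This is a sketch under assumptions: the function name and the two sample rows are hypothetical, and the union is built as a Python list in document order instead of with |.

```python
import xml.etree.ElementTree as ET

def deepest_target(row):
    """Mimic (td[2]/a[1] | td[2]/a[1]/tag[1])[last()] for one row element."""
    a = row.find("td[2]/a[1]")
    if a is None:
        return None
    # The parent precedes its child in document order,
    # so the child (if there is one) ends up last in the list.
    candidates = [a] + a.findall("tag[1]")
    return candidates[-1]

# Hypothetical rows, one with and one without the tag element.
WITH_TAG = ET.fromstring(
    "<tr><td/><td><a innertext='link'><tag innertext='inner'/></a></td></tr>"
)
WITHOUT_TAG = ET.fromstring(
    "<tr><td/><td><a innertext='link'/></td></tr>"
)

print(deepest_target(WITH_TAG).tag)     # tag
print(deepest_target(WITHOUT_TAG).tag)  # a
```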

Modify XPath to return second of two values

I have an XPath that returns two items. I want to modify it so that it returns only the second, or the last if there are more than 2.
//a[@rel='next']
I tried
//a[@rel='next'][2]
but that doesn't return anything at all. How can I rewrite the xpath so I get only the 2nd link?
Found the answer in
XPATH : finding an attribute node (and only one)
In my case the right XPath would be
(//a[@rel='next'])[last()]
EDIT (by Tomalak) - Explanation:
This selects all a[@rel='next'] nodes, and takes the last of the entire set:
(//a[@rel='next'])[last()]
This selects all a[@rel='next'] nodes that are the respective last a[@rel='next'] of the parent context each of them is in:
//a[@rel='next'][last()]
Note that this is not the same as //a[@rel='next' and position()=last()]: inside a single predicate, position() and last() count all a children of the parent, not only those with @rel='next'.
This selects all a[@rel='next'] nodes that are the second a[@rel='next'] of the parent context each of them is in (in your case, each parent context had only one a[@rel='next'], which is why you did not get anything back):
//a[@rel='next'][2]
For the sake of completeness: this selects all a nodes that are the last a of the parent context each of them is in, and of those only the ones that have @rel='next' (XPath predicates are applied from left to right):
//a[last()][@rel='next']
This is equivalent to //a[position()=last() and @rel='next'], but not to //a[@rel='next'][last()].
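The per-parent versus whole-set distinction can be tried out with Python's standard xml.etree.ElementTree module. Treat this as a sketch under assumptions: the sample page is invented, and ElementTree implements only a subset of XPath whose positional predicates count among same-named siblings, so the sample is built with a single a per parent, where that reading and the XPath one agree. Parenthesised expressions are not supported, so "last of the whole set" is taken from the Python list.

```python
import xml.etree.ElementTree as ET

# Hypothetical page with one "next" link in each of two containers.
PAGE = ET.fromstring(
    "<body>"
    "<div id='top'><a rel='next' href='/page/2#top'>Next</a></div>"
    "<div id='bottom'><a rel='next' href='/page/2#bottom'>Next</a></div>"
    "</body>"
)

# Counterpart of //a[@rel='next'] : both links, in document order.
links = PAGE.findall(".//a[@rel='next']")

# Counterpart of //a[@rel='next'][2] : position is counted inside each
# parent, and each div holds only one link, so nothing is selected.
second_per_parent = PAGE.findall(".//a[@rel='next'][2]")

# Counterpart of (//a[@rel='next'])[last()] : last item of the whole set.
last_overall = links[-1]

print(len(links))                # 2
print(second_per_parent)         # []
print(last_overall.get("href"))  # /page/2#bottom
```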
