How to search for following-sibling and their descendants? - xpath

This will give all the siblings following the current node.
./following-sibling::*[somecondition]
This will give all the descendants of those siblings.
./following-sibling::*//*[somecondition]
Is there a way to combine the two better than the following?
./following-sibling::*[somecondition] | ./following-sibling::*//*[somecondition]
xpath using // and descendant-or-self and self
The above page has nothing to do with my question. I am not asking about self and its descendants.

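You should be able to use following::* instead of following-sibling::*.
following::* matches both the following siblings and the elements nested under them. Note that it also matches every other element that comes after the current node in document order, such as the following siblings of its ancestors.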
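For non-positional conditions, the two expressions from the question can also be merged with the descendant-or-self axis: ./following-sibling::*/descendant-or-self::*[somecondition]. The sketch below emulates that expression with Python's standard xml.etree.ElementTree, which does not support these axes itself. The sample document and the helper name following_siblings_and_descendants are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; <b> plays the role of the current node.
DOC = """
<root>
  <a><x/></a>
  <b/>
  <c><x/><y><x/></y></c>
  <d/>
</root>
"""


def following_siblings_and_descendants(root, current, condition):
    """Emulate ./following-sibling::*/descendant-or-self::*[condition]."""
    # ElementTree elements have no parent pointer, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    parent = parent_of.get(current)
    if parent is None:  # the root element has no siblings
        return []
    siblings = list(parent)
    after = siblings[siblings.index(current) + 1:]
    # Element.iter() yields the element itself, then its descendants,
    # in document order.
    return [node for sib in after for node in sib.iter() if condition(node)]


root = ET.fromstring(DOC)
current = root.find("b")

matches = following_siblings_and_descendants(root, current, lambda e: e.tag == "x")
all_nodes = following_siblings_and_descendants(root, current, lambda e: True)

print([m.tag for m in matches])
print([n.tag for n in all_nodes])
```

The <x/> inside <a> is not returned because <a> precedes the current node; only <c>, <d> and what is nested under them are considered.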

Related

Xpath child of multiple types

I have this xpath:
.//*[@id='some_id']/td//div
and now I want to select any child of the div that is of certain type, for example every child that is either a label or span. Something like this
.//*[@id='some_id']/td//div/(label|span)/.......
but that is not valid XPath. How can I do that (without writing two full XPaths for the two example child types)?
descendant:: matches at every level below; to match only children, use
.//*[@id='some_id']/td//div/*[self::label or self::span]
You need to use
descendant::
to select elements nested under a particular element. Look at the example below:
.//*[@id='some_id']/td//div/descendant::label[@class='some-class']
The above XPath gets every label with class "some-class" that is a descendant of the ".//*[@id='some_id']/td//div" element.
To match multiple element types, use the XPath below:
.//*[@id='some_id']/td//div/descendant::*[local-name()='label' or local-name()='span']

XPATH - cannot select grandparent node

I am trying to parse a live betting XML feed and need to grab each bet from within the code. In plain English, I need to use the tag 'EventSelections' for my base query and 'loop' through these tags in the XML so that I grab all that data and it creates an entity for each one, which I can use on a CMS.
My problem is that I want to go up two places in the tree to a grandparent node to gather that info. Each EventID refers to the unique name of a game, and some games have more bets than others. It's important that I grab each bet AND the EventID associated with it; the problem is that this ID is on the grandparent each time. Example:
<Sportsbet Time="2013-08-03T08:38:01.6859354+09:30">
<Competition CompetitionID="18" CompetitionName="Baseball">
<Round RoundID="2549" RoundName="Major League Baseball">
<Event EventID="849849" EventName="Los Angeles Dodgers (H Ryu) At Chicago Cubs (T Wood)" Venue="" EventDate="2013-08-03T05:35:00" Group="MTCH">
<Market Type="Match Betting - BIR" EachWayPlaces="0">
<EventSelections BetSelectionID="75989549" EventSelectionName="Los Angeles Dodgers">
<Bet Odds="1.00" Line=""/>
</EventSelections>
<EventSelections BetSelectionID="75989551" EventSelectionName="Chicago Cubs">
<Bet Odds="17.00" Line=""/>
</EventSelections>
Does anyone know how I can grab the grandparent tags as well?
Currently I am using:
//EventSelections (this is the context)
.//@BetSelectionID
.//@EventSelectionName
I have tried dozens of different ways to do this including the ../.. operator which won't work either. I'd be eternally grateful for any help on this. Thanks.
I think you just haven't gone far enough up the tree.
../* is a two-step location path with abbreviations, expanded to parent::node()/child::* ... so in effect you are going up the tree with the first step, but back down the tree for the second step.
Therefore, ../* gives you your siblings (parent's children), ../../* gives you your aunts and uncles (grandparent's children), and ../../../* gives you your grandparent and its siblings (great-grandparent's children).
For attributes, ../@* is an abbreviation for parent::node()/attribute::*, and since attributes are attached to elements, they are not considered children. So you are going sideways, not down the tree, in the second step.
Therefore, unlike above, ../@* gives you your parent's attributes, while ../../@* gives you your grandparent's attributes.
But using // in your situation is really inappropriate. // is an abbreviation for /descendant-or-self::node()/, which walks all the way down a tree to the leaves of the tree. It should be used only on rare occasions (and I cringe when I see it abused in SO questions).
So ..//..//..//@RoundID may work for you, but it is in effect addressing attributes all over the tree and not just an attribute of your great-grandparent, which is why it is finding the attribute of your grandparent. ../../@RoundID should be all you need to get the attribute of your grandparent.
If you torture a stylesheet long enough, it will eventually work for you, but it really is more robust and likely faster executing to address things properly.
You could go with ancestor::Event/@EventID, which does exactly what you asked for: it matches an ancestor element named Event and returns its EventID attribute.
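As a rough illustration of pairing each EventSelections element with the EventID two levels up, here is a sketch using Python's standard xml.etree.ElementTree. That module has no attribute or ancestor axis, so a child-to-parent map stands in for ../.. here. The feed is the excerpt from the question, with the missing closing tags added as an assumption so that it parses.

```python
import xml.etree.ElementTree as ET

# Excerpt from the question; the closing tags for Market, Event, Round,
# Competition and Sportsbet are assumed, since the original excerpt is cut off.
FEED = """
<Sportsbet Time="2013-08-03T08:38:01.6859354+09:30">
<Competition CompetitionID="18" CompetitionName="Baseball">
<Round RoundID="2549" RoundName="Major League Baseball">
<Event EventID="849849" EventName="Los Angeles Dodgers (H Ryu) At Chicago Cubs (T Wood)" Venue="" EventDate="2013-08-03T05:35:00" Group="MTCH">
<Market Type="Match Betting - BIR" EachWayPlaces="0">
<EventSelections BetSelectionID="75989549" EventSelectionName="Los Angeles Dodgers">
<Bet Odds="1.00" Line=""/>
</EventSelections>
<EventSelections BetSelectionID="75989551" EventSelectionName="Chicago Cubs">
<Bet Odds="17.00" Line=""/>
</EventSelections>
</Market>
</Event>
</Round>
</Competition>
</Sportsbet>
"""

root = ET.fromstring(FEED)

# ElementTree elements do not know their parent, so build a lookup table.
parent_of = {child: parent for parent in root.iter() for child in parent}

rows = []
for selection in root.iter("EventSelections"):
    market = parent_of[selection]  # ..     (parent)
    event = parent_of[market]      # ../..  (grandparent)
    rows.append((
        event.get("EventID"),                 # ../../@EventID
        selection.get("BetSelectionID"),      # @BetSelectionID
        selection.get("EventSelectionName"),  # @EventSelectionName
        selection.find("Bet").get("Odds"),    # Bet/@Odds
    ))

for row in rows:
    print(row)
```

Each row carries the grandparent's EventID alongside the bet's own attributes, which is what ../../@EventID would return from an EventSelections context in a full XPath engine.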

XPath: How to select node with some attribute by index?

I have several nodes with some particular attribute and I need to select one of them by index. For example, I need to select the second <div> with the 'test' class, but //div[@class='test'][2] doesn't work.
Is there a way to select a node with some attribute by index? How do I do it?
This is a FAQ.
In XPath the [] operator has a higher precedence (binds stronger) than the // pseudo-operator.
Because of this, the expression:
//div[@class='test'][2]
selects all div elements whose class attribute is "test" and that are the second such div child of their parent. This is not what you want.
Use:
(//div[@class='test'])[2]
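The difference between the two expressions can be sketched in Python. The standard xml.etree.ElementTree module does not implement full XPath, so the per-parent behaviour is emulated by hand here; the sample page and its id values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: no section holds more than one div with class "test".
PAGE = """
<body>
  <section><div class="test" id="first"/><div class="other" id="x"/></section>
  <section><div class="test" id="second"/></section>
  <section><div class="test" id="third"/></section>
</body>
"""

root = ET.fromstring(PAGE)

# (//div[@class='test'])[2]: collect every match in document order, then
# index the whole list. XPath positions start at 1, Python lists at 0.
all_matches = root.findall(".//div[@class='test']")
second_overall = all_matches[1]

# //div[@class='test'][2]: the [2] is applied separately inside each parent,
# so a parent needs at least two matching div children to contribute anything.
second_per_parent = []
for parent in root.iter():
    matching = [c for c in parent if c.tag == "div" and c.get("class") == "test"]
    if len(matching) >= 2:
        second_per_parent.append(matching[1])

print(second_overall.get("id"))
print([e.get("id") for e in second_per_parent])
```

With this sample, the document-wide index finds the div with id "second", while the per-parent version finds nothing, which matches the "doesn't work" symptom in the question.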
I believe per XML specification, attributes are not considered to have an order.
Note that the order of attribute specifications in a start-tag or empty-element tag is not significant.
See here
I think you'd be best off refactoring your structure so that attribute order does not describe anything. If you can give any more details we might be able to offer suggestions.
EDIT: Re-reading your post, looks like you are trying to find node order and not attribute order. Node order is allowed and your syntax looks OK off-hand. What software are you doing this in?

xpath return all non-blank text nodes not descendant of `a`, `style` or `script`

What expression would select all text nodes which are:
not blank
not inside a, or script or style?
Use:
//*[not(self::a or self::script or self::style)]/text()[normalize-space()]
Not only is this expression shorter than the one in the currently accepted answer, but it also may be much more efficient.
Do note that the expression doesn't use any backward (upward) axes at all.
This should do, assuming "not inside" means the text node is not supposed to be a descendant of an "a" or "script" or "style" element. If "not inside" only means not supposed to be a child then use parent::a and so on instead of ancestor::a.
//text()[normalize-space() and not(ancestor::a | ancestor::script | ancestor::style)]
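To make the "no a, script or style ancestor" reading concrete, here is a sketch in Python's standard xml.etree.ElementTree. That module has no text() nodes or ancestor axis, so the walk below emulates them with the .text and .tail attributes. The sample markup and the function name visible_text are made up, and the sketch returns stripped strings rather than nodes.

```python
import xml.etree.ElementTree as ET

EXCLUDED = {"a", "script", "style"}

# Hypothetical sample markup.
HTML = """
<div>
  <p>Keep me <a href="#">skip <b>nested</b></a> tail text</p>
  <style>p { color: red; }</style>
  <script>var x = 1;</script>
  <span>   </span>
</div>
"""


def visible_text(elem):
    """Yield non-blank text that has no excluded ancestor element."""
    if elem.tag in EXCLUDED:
        return  # skip the whole subtree, including nested elements
    if elem.text and elem.text.strip():
        yield elem.text.strip()
    for child in elem:
        yield from visible_text(child)
        # child.tail is the text that follows the child's end tag, so it
        # belongs to elem, not to the (possibly excluded) child.
        if child.tail and child.tail.strip():
            yield child.tail.strip()


texts = list(visible_text(ET.fromstring(HTML)))
print(texts)
```

The text inside <b> is dropped because it sits under an <a>, while " tail text" is kept because it follows the closing </a> and is a child of <p>.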
I used Dimitre Novatchev's answer, but then I ran into the problem described by the topic starter:
not descendant of a, style or script
Dimitre's answer excludes the style tag but includes its children.
This version also excludes style, script and noscript tags and their descendants:
//div[@id='???']//*[not(ancestor-or-self::script or ancestor-or-self::noscript or ancestor-or-self::style)]/text()
Anyway, thanks to Dimitre Novatchev.

Modify XPath to return second of two values

I have an XPath that returns two items. I want to modify it so that it returns only the second, or the last if there are more than 2.
//a[@rel='next']
I tried
//a[@rel='next'][2]
but that doesn't return anything at all. How can I rewrite the XPath so I get only the 2nd link?
Found the answer in
XPATH : finding an attribute node (and only one)
In my case the right XPath would be
(//a[@rel='next'])[last()]
EDIT (by Tomalak) - Explanation:
This selects all a[@rel='next'] nodes, and takes the last of the entire set:
(//a[@rel='next'])[last()]
This selects all a[@rel='next'] nodes that are the respective last a[@rel='next'] of the parent context each of them is in:
//a[@rel='next'][last()] NOT equivalent: //a[@rel='next' and position()=last()]
This selects all a[@rel='next'] nodes that are the second a[@rel='next'] of the parent context each of them is in (in your case, each parent context had only one a[@rel='next'], that's why you did not get anything back):
//a[@rel='next'][2] NOT equivalent: //a[@rel='next' and position()=2]
For the sake of completeness: This selects all a nodes that are the last of the parent context each of them is in, and of them only those that have @rel='next' (XPath predicates are applied from left to right, and inside a single predicate position() counts among all a children of the parent, not only the ones with @rel='next'):
//a[last()][@rel='next'] equivalent: //a[position()=last() and @rel='next']
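The effect of predicate order can be sketched in Python. The standard xml.etree.ElementTree module does not implement these predicates with full XPath semantics, so each expression is emulated by hand below; the pager markup and the href values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical pager markup with one "next" link per container.
PAGE = """
<body>
  <div id="top"><a rel="next" href="/p2">next</a><a rel="prev" href="/p0">prev</a></div>
  <div id="bottom"><a rel="prev" href="/p0">prev</a><a rel="next" href="/p2b">next</a></div>
</body>
"""

root = ET.fromstring(PAGE)


def is_next(a):
    return a.get("rel") == "next"


# (//a[@rel='next'])[last()]: the last match in the whole document.
last_overall = root.findall(".//a[@rel='next']")[-1]

filter_then_last = []  # //a[@rel='next'][last()]
last_then_filter = []  # //a[last()][@rel='next']
for parent in root.iter():
    links = [c for c in parent if c.tag == "a"]
    nexts = [a for a in links if is_next(a)]
    if nexts:
        # Filter by rel first, then take the last of what is left.
        filter_then_last.append(nexts[-1])
    if links and is_next(links[-1]):
        # Take the last a child first, then keep it only if rel matches.
        last_then_filter.append(links[-1])

print(last_overall.get("href"))
print([a.get("href") for a in filter_then_last])
print([a.get("href") for a in last_then_filter])
```

In the "top" container the last a child is the "prev" link, so only the filter-first form returns anything for that parent.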

Resources