XPath: Find first occurrence in children and siblings

I have some HTML that looks like this:
<tr class="a">
<td>...</td>
<td>...</td>
</tr>
<tr>
<td>....</td>
<td class="b">A</td>
</tr>
<tr>....</tr>
<tr class="a">
<td class="b">B</td>
<td>....</td>
</tr>
<tr>
<td class="b">Not this</td>
<td>....</td>
</tr>
I want to find the first td with class b following each tr with class a. The problem is that it could be either in a child of that tr or in the next tr after it.
I can get the second case with:
//tr[@class="a"]//td[@class="b"]
But that misses the first case, because the td is in a sibling row, not a descendant. Ideas?

For the second case (the td is a descendant of the tr):
//tr[@class="a"]//td[@class="b"][1]
For the first case (the td follows the tr), restricted to rows that do not already fall under the second case:
//tr[@class="a" and not(.//td[@class="b"])]/following::td[@class="b"][1]
Combining the two XPath queries with the union operator (|) yields the expected output:
//tr[@class="a"]//td[@class="b"][1] | //tr[@class="a" and not(.//td[@class="b"])]/following::td[@class="b"][1]
Output:
Element='<td class="b">A</td>'
Element='<td class="b">B</td>'
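As a cross-check on why the union works: the descendant and following axes together cover every node that comes after the context node in document order, so "first td.b below or after each tr.a" is simply the first td.b later in the document. The sketch below is plain Python traversal, not XPath evaluation, because the standard library's xml.etree.ElementTree implements only an XPath subset without the following axis. It assumes the rows are wrapped in a <table> element to make the fragment well-formed, and first_b_after_each_a is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The question's rows, wrapped in a <table> so the fragment is well-formed XML.
SAMPLE = """
<table>
  <tr class="a"><td>...</td><td>...</td></tr>
  <tr><td>....</td><td class="b">A</td></tr>
  <tr>....</tr>
  <tr class="a"><td class="b">B</td><td>....</td></tr>
  <tr><td class="b">Not this</td><td>....</td></tr>
</table>
"""

def first_b_after_each_a(root):
    """For each tr.a, return the text of the first td.b later in document order."""
    nodes = list(root.iter())  # Element.iter() walks the tree in document order
    found = []
    for i, node in enumerate(nodes):
        if node.tag == "tr" and node.get("class") == "a":
            for later in nodes[i + 1:]:
                if later.tag == "td" and later.get("class") == "b":
                    found.append(later.text)
                    break
    return found

result = first_b_after_each_a(ET.fromstring(SAMPLE))
print(result)
```

With a full XPath 1.0 engine, the union expression above can be evaluated directly and the traversal is unnecessary.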

Related

Xpath: Wildcards for descendant nodes not working

Desired output: 3333
<tbody>
<tr>
<td class="name">
<p class="desc">Intel</p>
</td>
</tr>
Other tr tags
<tr>
<td class="tel">
<p class="desc">3333</p>
</td>
</tr>
</tbody>
I want to select the last tr tag after the tr tag that has "Intel" in the p tag:
//tbody//tr[td[p[contains(text(),'Intel')]]]/following-sibling::tr[position()=last()]//p/text()
The above works, but I don't want to reference td and p explicitly. I tried the wildcards ? and *, but they don't work:
//tbody//tr[?[?[contains(text(),'Intel')]]]/following-sibling::tr[position()=last()]//p/text()
"...which contains a text node equal to 'Intel'"
//tbody/tr[.//text() = 'Intel']/following-sibling::tr[last()]/td/p/text()
"...which contains only the string 'Intel', once you remove all insignificant white-space"
//tbody/tr[normalize-space() = 'Intel']/following-sibling::tr[last()]/td/p/text()
I think the key take-away here is that you can use descendant paths (//) and pay attention to context in predicates once you make them relative (.//).
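To make the first predicate concrete, here is a rough Python sketch of what [.//text() = 'Intel'] followed by following-sibling::tr[last()] asks for: a row qualifies if any text node beneath it equals 'Intel' exactly, and the last later sibling row is then taken. It is a hand-written walk rather than XPath evaluation, since ElementTree's XPath subset has neither the text() node test nor the following-sibling axis. The middle row stands in for the "Other tr tags" and is an assumption, as is the helper name has_text_node.

```python
import xml.etree.ElementTree as ET

# The middle row is a placeholder for the question's "Other tr tags".
SAMPLE = """
<tbody>
  <tr><td class="name"><p class="desc">Intel</p></td></tr>
  <tr><td class="other"><p class="desc">placeholder</p></td></tr>
  <tr><td class="tel"><p class="desc">3333</p></td></tr>
</tbody>
"""

def has_text_node(row, wanted):
    """True if any text node inside row equals wanted exactly."""
    for el in row.iter():
        if el.text == wanted:
            return True
        # An element's tail belongs to its parent, so skip the row's own tail.
        if el is not row and el.tail == wanted:
            return True
    return False

tbody = ET.fromstring(SAMPLE)
rows = tbody.findall("tr")
values = []
for i, row in enumerate(rows):
    later = rows[i + 1:]              # following-sibling::tr
    if has_text_node(row, "Intel") and later:
        last = later[-1]              # [last()]
        values.extend(p.text for p in last.findall("td/p"))  # td/p/text()

print(values)
```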

Xpath to select next parent of the current node

If a tr contains class="productnamecolor colors_productname", I want to select the next tr, which contains the price details. So I use:
.//a[@class="productnamecolor colors_productname"]/parent::node()/following-sibling::tr
But it didn't work. What is wrong with this expression?
HTML:
<tr>
<td valign="top" width="100%">
Trouser Suspenders
</td>
</tr>
Thanks in advance.
The parent of your <a> element is a td element, and the td element doesn't have a following-sibling - certainly not a following sibling that is a tr. If you want the next row in the table, use
.//a[@class="..."]/ancestor::tr[1]/following-sibling::tr[1]
or
.//tr[descendant::a/@class="..."]/following-sibling::tr[1]
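The point about the parent being a td can be checked with a small Python sketch. The markup here is hypothetical: the posted HTML does not show the <a> element or the price row, so both are assumptions added for illustration. The walk mirrors ancestor::tr[1]/following-sibling::tr[1] by hand, because ElementTree's XPath subset has no ancestor or following-sibling axis.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the <a> element and the price row are assumptions.
SAMPLE = """
<table>
  <tr><td valign="top" width="100%"><a class="productnamecolor colors_productname">Trouser Suspenders</a></td></tr>
  <tr><td>price details</td></tr>
</table>
"""

root = ET.fromstring(SAMPLE)
# ElementTree elements have no parent pointer, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

link = root.find(".//a[@class='productnamecolor colors_productname']")
parent_tag = parent_of[link].tag      # the td, not the tr

# Climb to the nearest tr ancestor: ancestor::tr[1]
row = link
while row.tag != "tr":
    row = parent_of[row]

# Take the next tr sibling: following-sibling::tr[1]
siblings = list(parent_of[row])
later_rows = [s for s in siblings[siblings.index(row) + 1:] if s.tag == "tr"]
next_row = later_rows[0] if later_rows else None

print(parent_tag)
print(next_row.find("td").text if next_row is not None else None)
```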
If you want to select just the next tr after <a class="productnamecolor colors_productname">, either of the following works.
Using the following axis:
(.//a[@class="productnamecolor colors_productname"]/following::tr)[1]
Using the preceding axis:
(.//tr[preceding::a[@class="productnamecolor colors_productname"]])[1]
Hope it helps...:)

Finding first matching sibling element while traversing the DOM

I am trying to create an XPath expression that will find the first matching sibling 'down' the DOM given an initial sibling (note: the initial siblings will be Tom and Steve). For example, I want to find 'jerry1' under the 'Tom' tr. I have looked into the following-sibling axis, but I'm not sure that's the best approach for this. Any ideas?
<tr>
<a title="Tom"/>
</tr>
<tr>
<a title="jerry1"/>
</tr>
<tr>
<a title="jerry2"/>
</tr>
<tr>
<a title="jerry3"/>
</tr>
<tr>
<a title="Steve"/>
</tr>
<tr>
<a title="jerry1"/>
</tr>
<tr>
<a title="jerry2"/>
</tr>
<tr>
<a title="jerry3"/>
</tr>
following-sibling will work. This will select the a node with the title "jerry1":
//a[@title='Tom']/../following-sibling::tr[1]/a
The /.. traverses up to Tom's parent <tr>, then following-sibling::tr[1] moves to the next <tr> (without the [1], every later row would match), then finally the <a> node within that.
The following XPath worked for me:
(//a[@title='Tom']/parent::*/following-sibling::tr/a[@title='jerry1'])[1]
This selects the first a with title jerry1 following a tr that has an a child with title Tom.
Starting at a[@title='Tom'], /parent::* goes up to the parent tr, /following-sibling::tr selects all following sibling tr nodes, and /a[@title='jerry1'] selects their a children with the title jerry1. Because this would select two jerry1 nodes and only the first one after Tom is wanted, the whole expression is wrapped in () and the first match is chosen with [1].
The following XPath statement finds the first tr element that has an a with the @title "jerry1" and that is a following sibling of the tr element that has an a with the @title "Tom":
//tr[a/@title='Tom']/following-sibling::tr[a/@title='jerry1'][1]
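The logic of that last expression can be sketched in plain Python, which makes it easy to confirm that starting from Tom and starting from Steve land on different jerry1 rows. This is a manual sibling walk, not XPath evaluation (ElementTree's XPath subset lacks following-sibling); the <table> wrapper and the helper name first_following_row are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Rebuild the question's rows in order, wrapped in a <table> for well-formedness.
NAMES = ["Tom", "jerry1", "jerry2", "jerry3", "Steve", "jerry1", "jerry2", "jerry3"]
rows_xml = "".join(f'<tr><a title="{n}"/></tr>' for n in NAMES)
table = ET.fromstring(f"<table>{rows_xml}</table>")

def first_following_row(table, start_title, wanted_title):
    """Return the 0-based index of the first later sibling row whose <a> child
    has wanted_title, counting from the first row whose <a> has start_title."""
    rows = table.findall("tr")
    for i, row in enumerate(rows):
        if row.find(f"a[@title='{start_title}']") is not None:
            for later in rows[i + 1:]:          # following-sibling::tr
                if later.find(f"a[@title='{wanted_title}']") is not None:
                    return rows.index(later)    # first match only, like [1]
    return None

tom_hit = first_following_row(table, "Tom", "jerry1")
steve_hit = first_following_row(table, "Steve", "jerry1")
print(tom_hit, steve_hit)
```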

xpath check if node has not specific children

I have an HTML file like this:
<tr>
<td class= 'iconmenu' width="100%">...</td>
</tr>
<tr>
<td class= 'iconmenu' width="100%">...</td>
<td class= 'iconmenu'>...</td>
</tr>
The first element has one child and the second has two children. The question is:
How can I check whether the first element has exactly one child?
Counting the number of child elements of the first tr:
count(/*/tr[1]/*)
Counting only td children:
count(/*/tr[1]/td)
Perform the comparison like this:
<xsl:if test="count(/*/tr[1]/td)=1">
Note the use of an absolute path (starting with /). You may be tempted to use count(//tr[1]/td), but that counts the td children of every tr that is the first tr child of its parent, that is, the first row of every table in the document.
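The difference between the anchored path and //tr[1] can be seen with a small Python sketch. It uses the XPath subset in xml.etree.ElementTree, which has no count() function, so len() of the result list stands in for it. The <root> wrapper and the second table are hypothetical additions, there only to make the over-count visible.

```python
import xml.etree.ElementTree as ET

# First table is from the question; the second table is a hypothetical addition.
SAMPLE = """
<root>
  <table>
    <tr><td class="iconmenu" width="100%">...</td></tr>
    <tr><td class="iconmenu" width="100%">...</td><td class="iconmenu">...</td></tr>
  </table>
  <table>
    <tr><td>x</td><td>y</td></tr>
  </table>
</root>
"""

root = ET.fromstring(SAMPLE)

# Anchored path: td children of the first row of the first table only.
first_table_count = len(root.findall("./table[1]/tr[1]/td"))

# Descendant path: td children of the first row of every table.
all_first_rows_count = len(root.findall(".//tr[1]/td"))

print(first_table_count, all_first_rows_count)
```

Here the anchored path answers the question for one specific row, while the descendant path adds up the first rows of both tables.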
<xsl:for-each select="tr">
<xsl:variable name="count" select="count(td)"/>
</xsl:for-each>
</xsl:template>
let $s := <test><table>
<tr>
<td class= "iconmenu" width="100%">...</td>
</tr>
<tr>
<td class="iconmenu" width="100%">...</td>
<td class="iconmenu">...</td>
</tr>
</table></test>
return fn:count($s//table/tr[1]/td)
Please refer to: http://www.xqueryfunctions.com/xq/c0015.html#c0016

xpath expression to find url and data

I want to get the values of every row and the href value of every <a> within the table given below.
Being new to XPath, I am finding it difficult to write XPath expressions.
Understanding what an existing XPath expression does is somewhat easier.
The expected output:
http://a.com/ data for a 526735 Z
http://b.com/ data for b 522273 Z
http://c.com/ data for c 513335 Z
<table class="dataTable">
<tbody>
<tr>
<td>data for a</td>
<td class="numericalColumn">526735</td>
<td class="numericalColumn">Z</td></tr>
<tr>
<td>data for b</td>
<td class="numericalColumn">522273</td>
<td class="numericalColumn">B</td></tr>
<tr>
<td>data for c</td>
<td class="numericalColumn">513335</td>
<td class="numericalColumn">B</td></tr>
</tbody>
</table>
You'll need two things: an XPath query which locates the wanted nodes and a second which outputs the text as you want it. Since you don't give more information about the languages you're using I'm putting together some pseudocode:
foreach node in document.select("//table[@class='dataTable']//tr[td/a/@HREF]")
write node.select("concat(td/a/@HREF,' ',.)")
This site has a great free tool for building XPath Expressions (XPath Builder):
http://www.bubasoft.net/
Use this XPath: //tr/td/a/@HREF | //tr//text()

Resources