XPath simplification: extract text of self and child node

Given this HTML snippet:
<td class="info">self-text
<br>
<b>child-text</b>
</td>
I would like to extract self-text and child-text.
So far I am using this XPath expression:
.//td[contains(@class, 'info')]/text() | .//td[contains(@class, 'info')]/b/text()
Is there a simpler way to do this?

You can use the following XPath expression, which returns all non-whitespace text nodes anywhere within the outer td element:
.//td[contains(@class, 'info')]//text()[normalize-space()]
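As a rough check of that expression, here is a minimal sketch in Python. It assumes the third-party lxml library is installed, and it wraps the snippet in a table and tr (an addition, not part of the question) so the HTML parser keeps the td. The variable names are illustrative only.

```python
from lxml import html

# The question's snippet, wrapped in <table><tr> (an assumption) so the
# HTML parser keeps the <td> element in place.
snippet = """<table><tr><td class="info">self-text
<br>
<b>child-text</b>
</td></tr></table>"""

root = html.fromstring(snippet)

# All descendant text nodes of the td that are not whitespace-only.
texts = root.xpath(".//td[contains(@class, 'info')]//text()[normalize-space()]")

# The predicate only filters nodes; surrounding whitespace inside each
# surviving node is still there, so strip it separately.
cleaned = [t.strip() for t in texts]
print(cleaned)
```

Note that normalize-space() in the predicate drops whitespace-only nodes but does not trim the ones that remain, which is why the sketch strips each result.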

Related

Scrapy: How do I select the next `td` in this `tr`?

I want to select the next sibling of a td tag in a tr element.
The tr element is this:
<tr>
<td>Created On:</td>
<td>06/28/2018 06:32 </td>
</tr>
My Scrapy code looks like this: response.xpath("//text()[contains(.,'Created On:')]/following-sibling::td"). But that gives me an empty list [].
How do I select the next td?
Try this XPath expression:
//text()[contains(.,'Created On:')]/../following-sibling::td
You were trying to use the following-sibling axis from the wrong context node. Going back one level fixes this problem.
An alternative is matching the td element in the first place like in this expression:
//td[contains(text(),'Created On:')]/following-sibling::td
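The difference between the three expressions can be seen in a small sketch. It assumes lxml is installed and uses it in place of Scrapy's selector, with the row wrapped in a table (an addition for parsing). The variable names are made up for illustration.

```python
from lxml import html

# The row from the question, wrapped in a <table> (an assumption).
root = html.fromstring(
    "<table><tr><td>Created On:</td><td>06/28/2018 06:32 </td></tr></table>"
)

# Original attempt: the context node is the text node, which has no
# element siblings, so nothing is selected.
from_text = root.xpath("//text()[contains(.,'Created On:')]/following-sibling::td")

# Step up to the parent td first, then take its following sibling.
from_parent = root.xpath("//text()[contains(.,'Created On:')]/../following-sibling::td")

# Match the td directly and take its following sibling.
from_td = root.xpath("//td[contains(text(),'Created On:')]/following-sibling::td")

print(from_text)
print([td.text.strip() for td in from_parent])
print([td.text.strip() for td in from_td])
```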

Xpath to select next parent of the current node

If a tr contains an a element with class="productnamecolor colors_productname", I want to select the next tr, which contains the price details. So I use:
.//a[@class="productnamecolor colors_productname"]/parent::node()/following-sibling::tr
But it didn't work. What is wrong with this expression?
HTML :
<tr>
<td valign="top" width="100%">
Trouser Suspenders
</td>
</tr>
Thanks in advance.
The parent of your <a> element is a td element, and the td element doesn't have a following-sibling - certainly not a following sibling that is a tr. If you want the next row in the table, use
.//a[@class="..."]/ancestor::tr[1]/following-sibling::tr[1]
or
.//tr[descendant::a/@class="..."]/following-sibling::tr[1]
If you want to select just the next tr after <a class="productnamecolor colors_productname">, either of the following works.
Using the following axis:
(.//a[@class="productnamecolor colors_productname"]/following::tr)[1]
Using the preceding axis:
(.//tr[preceding::a[@class="productnamecolor colors_productname"]])[1]
Hope it helps.
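To compare these expressions side by side, here is a sketch assuming lxml is installed. The table markup below is hypothetical: the question's snippet does not show the a element or the price row, so the href, the price text, and the third row are invented purely for illustration.

```python
from lxml import html

# Hypothetical table: the link, price text and extra row are assumptions.
table = html.fromstring(
    """<table>
<tr><td valign="top" width="100%"><a class="productnamecolor colors_productname" href="#">Trouser Suspenders</a></td></tr>
<tr><td>Price: 10.00</td></tr>
<tr><td>Unrelated row</td></tr>
</table>"""
)

link = './/a[@class="productnamecolor colors_productname"]'

# The question's attempt: the parent of <a> is a td, which has no tr siblings.
failing = table.xpath(link + "/parent::node()/following-sibling::tr")

# Climb to the enclosing row, then take the row right after it.
via_ancestor = table.xpath(link + "/ancestor::tr[1]/following-sibling::tr[1]")

# First tr anywhere after the link in document order.
via_following = table.xpath("(" + link + "/following::tr)[1]")

# First tr that has the link before it (ancestors do not count as preceding).
via_preceding = table.xpath(
    '(.//tr[preceding::a[@class="productnamecolor colors_productname"]])[1]'
)

print(failing)
print(via_ancestor[0].text_content().strip())
```

On this sample all three working variants pick the same row. They can differ on nested tables, since the following and preceding axes are not limited to siblings of the current row.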

Selenium IDE with XPath to identify cell in table based on other column

Please take a look at the snippet of html below:
<tr class="clickable">
<td id="7b8ee8f9-b66f-4fba-83c1-4cf2827130b5" class="clickable">
<a class="editLink" href="#">Single</a>
</td>
<td class="clickable">£14.00</td>
</tr>
I'm trying to assert the value of td[2] when td[1] contains "Single". I've tried assorted variants of:
//td[2][(contains(text(),'£14.00'))]/../td[1][(contains(text(),'Single'))]
I've used similar notation elsewhere successfully - but to no avail here... I think it's down to td[1] having the nested element, but not sure.
Can someone enlighten as to what I'm getting wrong? :)
Cheers!
What about:
//tr[contains(td[1], "Single")]/td[2]
First select the <tr> containing the <td> matching the text, and then select td[2].
Then,
contains(//tr[contains(td[1], "Single")]/td[2], "£14.00")
should return True.
Or, closer to the expression you tried, you could test if this matches:
//tr[contains(td[1], "Single")]/td[2][contains(., "£14.00")]
See @JensErat's answer to "find xth td with td contains in same tr xpath python".
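The behaviour of these expressions can be checked outside Selenium IDE with a short sketch. It assumes lxml is installed and wraps the row in a table (an addition for parsing). The variable names are illustrative.

```python
from lxml import html

# The row from the question, wrapped in a <table> (an assumption).
root = html.fromstring(
    """<table><tr class="clickable">
<td id="7b8ee8f9-b66f-4fba-83c1-4cf2827130b5" class="clickable">
<a class="editLink" href="#">Single</a>
</td>
<td class="clickable">£14.00</td>
</tr></table>"""
)

# Original attempt: text() only sees the td's own text nodes, and
# "Single" lives inside the nested <a>, so nothing matches.
original = root.xpath(
    "//td[2][contains(text(),'£14.00')]/../td[1][contains(text(),'Single')]"
)

# contains(td[1], ...) uses the string value of the td, which includes
# the text of its descendants.
price_cells = root.xpath('//tr[contains(td[1], "Single")]/td[2]')

# The same check expressed as a boolean XPath result.
has_price = root.xpath('contains(//tr[contains(td[1], "Single")]/td[2], "£14.00")')

print(original)
print([td.text for td in price_cells])
print(has_price)
```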
Why not make it simple for yourself and do the check in your code? Pseudocode:
Select the top-level tr.
Find the first td within the tr and check whether it contains Single.
If it does, assert that the second td contains £14.00.
Alternatively, you could just get the text of the top-level tr and perform the checks on that text.

XPath - Locate node using its flattened descendant text

I have this HTML:
<tr>
<td>
Some
<strong>
text
</strong>
<em>
and more
</em>
</td>
...
</tr>
I need to locate my td element by its text, Some text and more. I know that I can get this text with this XPath expression:
//td//text()
But I cannot find a way to locate the td element itself. I tried this:
//td[//text()='Some text and more']
but I get errors. Do you know a working XPath expression for that?
Firstly, XPath uses forward slashes, never backslashes.
Secondly, I believe this may be the XPath you need:
//td[normalize-space(.) = 'Some text and more']
Could you give that a try?
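A quick way to try this is the sketch below, which assumes lxml is installed and wraps the row in a table (an addition for parsing). It also runs the expression from the question, which selects nothing here because no single text node equals the whole flattened string.

```python
from lxml import html

# The markup from the question (without the elided "..."), wrapped in a
# <table> as an assumption so the parser keeps the row structure.
root = html.fromstring(
    """<table><tr>
<td>
Some
<strong>
text
</strong>
<em>
and more
</em>
</td>
</tr></table>"""
)

# normalize-space(.) takes the td's full string value (all descendant
# text concatenated), trims it and collapses whitespace runs.
matches = root.xpath("//td[normalize-space(.) = 'Some text and more']")

# The question's expression compares individual text nodes instead.
per_node = root.xpath("//td[//text()='Some text and more']")

print(len(matches), len(per_node))
```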

xpath nearest element to a given element

I am having trouble returning an element using xpath.
I need to get the text from the 2nd TD from a large table.
<tr>
<td>
<label for="PropertyA">Some text here </label>
</td>
<td> TEXT!! </td>
</tr>
I'm able to find the label element, but then I'm having trouble selecting the sibling TD to return the text.
This is how I select the label:
"//label[@for='PropertyA']"
thanks
You are looking for the following-sibling axis. It searches only among siblings under the same parent, which here is the tr. If the td elements aren't in the same tr, they won't be found; in that case you can use the following axis instead.
//td[label[@for='PropertyA']]/following-sibling::td[1]
From the label element, it should be:
//label[@for='PropertyA']/following::td[1]
And then use the DOM method from the hosting language to get the string value.
Or select the text node (something I do not recommend) with:
//label[@for='PropertyA']/following::td[1]/text()
Or, if there is going to be just this one node, you could use the string() function:
string(//label[@for='PropertyA']/following::td[1])
You can also select from the common ancestor tr, like this:
//tr[td/label/@for='PropertyA']/td[2]
Getting ANY following sibling element:
//td[label[@for='PropertyA']]/following-sibling::*
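The variants above can be compared with a short sketch, assuming lxml is installed and with the row wrapped in a table (an addition for parsing). The variable names are illustrative.

```python
from lxml import html

# The row from the question, wrapped in a <table> (an assumption).
root = html.fromstring(
    '<table><tr><td><label for="PropertyA">Some text here </label></td>'
    "<td> TEXT!! </td></tr></table>"
)

# Sibling of the td that holds the label.
by_sibling = root.xpath("//td[label[@for='PropertyA']]/following-sibling::td[1]")

# First td after the label in document order (not limited to siblings).
by_following = root.xpath("//label[@for='PropertyA']/following::td[1]")

# Second cell of the row that holds the label.
by_row = root.xpath("//tr[td/label/@for='PropertyA']/td[2]")

# string() returns the value directly instead of a node list.
as_string = root.xpath("string(//label[@for='PropertyA']/following::td[1])")

print(by_sibling[0].text.strip())
print(as_string.strip())
```

The string() form returns the cell text with its surrounding spaces intact, so trimming is left to the host language, or to normalize-space() in the expression itself.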
