XPath: element containing exact text, but minus sibling elements?

Without relying on positional indexes, I'm trying to target an element by its exact text while ignoring the text of the elements that sit next to that text. For example, target the span with Save below.
<span>Click and save money!</span>
<span>
<i>Icon</i>
Save
</span>
So something like //span[contains(text(), 'Save')] would grab any span with "Save" in it.

Try this XPath: //span[text()[normalize-space(.)='Save']]
It looks for span elements that have a text node child whose whitespace-trimmed value is exactly Save.
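As a rough check of that expression, here is a minimal sketch in Python using lxml (my choice of library, not something the question specifies). The markup is a made-up fragment modeled on the question, with one extra span that starts with a capitalized "Save" so the difference from contains() is visible.

```python
from lxml import html

# Hypothetical fragment modeled on the question's markup.
doc = html.fromstring(
    "<div>"
    "<span>Save money now!</span>"
    "<span>\n<i>Icon</i>\nSave\n</span>"
    "</div>"
)

# Spans that own a text node whose trimmed value is exactly "Save".
exact = doc.xpath("//span[text()[normalize-space(.)='Save']]")

# For comparison: a case-sensitive substring test on each text node.
loose = doc.xpath("//span[text()[contains(., 'Save')]]")

print(len(exact), len(loose))
```

The exact-match predicate tests each text node of the span on its own, so the text inside the i element never takes part in the comparison.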

Related

XPATH - how to get the text if an element contains a certain class

How do I grab this text here?
I am trying to grab the link text based on the href containing "#faq-default".
I tried this first, but it doesn't grab the text, only the href value itself, which is pointless:
//a/@href[contains(., '#faq-default-2')]
There will be many of these hrefs, such as default-2, default-3 so I need to do some kind of contains query, I'd guess?
You are selecting the @href node value instead of the a element value. So try this instead:
//a[contains(@href, '#faq-default-2')]
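A small sketch of the difference, assuming Python with lxml and invented link texts and hrefs. It uses the shorter substring '#faq-default' so that default-2, default-3 and so on all match, which is what the question asks for.

```python
from lxml import html

# Hypothetical markup; link texts and hrefs are made up for illustration.
doc = html.fromstring(
    "<ul>"
    "<li><a href='/page#faq-default-2'>How do refunds work?</a></li>"
    "<li><a href='/page#faq-default-3'>Where is my order?</a></li>"
    "<li><a href='/contact'>Contact us</a></li>"
    "</ul>"
)

# Select the <a> elements themselves, filtered by a substring of their href.
links = doc.xpath("//a[contains(@href, '#faq-default')]")

# The text is then read from each matched element.
texts = [a.text_content().strip() for a in links]
print(texts)
```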

How to find xpath of an element under a heading

In a web page:
<h3 class="xh-highlight">Units Currently On Bed List</h3>
"[total beds=0]
"
I want to find the XPath of total beds=0.
How can I do that?
Your question and your comment are a bit contradictory. Do you want to find the text after a heading or do you want to find the element containing the text [total beds=0]? Also, how exact do you want to navigate your document?
To find a text after any h3 element you can use this: //h3/following-sibling::text()[1] (see XPath - select text after certain node).
To find a text after an h3 element with the class "xh-highlight" you can use this: //h3[@class='xh-highlight']/following-sibling::text()[1]
To be even more precise you can also look for the heading text: //h3[@class='xh-highlight' and text()='Units Currently On Bed List']/following-sibling::text()[1]
This doesn't match the html in your first comment however, so you might want to adjust the header class and text values. Also, it will find any first text even if there are other elements between it and the h3 element.
Now, your second comment makes it seem you actually want to find the element containing the text. The reason //*[text()='[total beds=0]'] doesn't work is because of the newline in the text. If you can get rid of that in the source it should match, otherwise you can "ignore" it in the xpath by using //*[normalize-space(text())='[total beds=0]']. (This is assuming the quotes around the text in your question aren't actually in the document.)
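Both approaches can be tried side by side. This is only a sketch, assuming Python with lxml and a wrapping div that I added around the question's snippet; the real page structure may differ.

```python
from lxml import html

# The question's snippet inside a hypothetical wrapping <div>.
doc = html.fromstring(
    "<div>"
    "<h3 class='xh-highlight'>Units Currently On Bed List</h3>\n"
    "[total beds=0]\n"
    "</div>"
)

# Approach 1: the first text node that follows the heading.
after = doc.xpath("//h3[@class='xh-highlight']/following-sibling::text()[1]")
print(repr(after[0]))

# Approach 2: the element whose first text node, once trimmed, equals the target.
holders = doc.xpath("//*[normalize-space(text())='[total beds=0]']")
print([el.tag for el in holders])
```

Note that normalize-space(text()) only looks at the first text node of each element, so the second approach can miss the text if other text precedes it inside the same parent.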

Xpath syntax to grab listed elements based on ID above containing word

I want to grab li element text and links from a list. The challenge is, the span sometimes has different class names BUT always has the word 'notable' featured in them, example:
<span class="mw-headline" id="Notable_alumni">Notable alumni</span>
OR
<span class="mw-headline" id="Notable_former_pupils">Notable former pupils</span>
So I need to use "contains" somehow, so I am along these lines:
//li[contains(span/@id,'Notable')]/span/@id/following-sibling::text()
But can't get this right.
Another issue is these blocks of text and headers are not in the same containing div either. Added an image to simplify and you can see the code.
This assumes that the span with the @id is always under the h2 (you could make it more generic by using * instead of h2 if that doesn't hold true). If you anchor to that containing element and then look for the first ul that is a following-sibling, you can select the text() from all of its li elements:
//h2[span[contains(@id,'Notable')]]/following-sibling::ul[1]/li//text()
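Here is a minimal sketch of that anchoring idea, assuming Python with lxml and the 'Notable' substring from the question. The list items and links are invented, and the h2/ul layout is a guess at the page structure described above.

```python
from lxml import html

# Hypothetical page fragment: headings and lists are siblings, not nested.
doc = html.fromstring(
    "<div>"
    "<h2><span class='mw-headline' id='Notable_alumni'>Notable alumni</span></h2>"
    "<ul><li><a href='/wiki/A'>Person A</a>, chemist</li><li>Person B</li></ul>"
    "<h2><span class='mw-headline' id='History'>History</span></h2>"
    "<ul><li>Founded long ago</li></ul>"
    "</div>"
)

# Anchor on the h2 that wraps the matching span, then take the first ul after it.
base = "//h2[span[contains(@id, 'Notable')]]/following-sibling::ul[1]"

texts = doc.xpath(base + "/li//text()")    # all text nodes inside the list items
hrefs = doc.xpath(base + "/li//a/@href")   # link targets inside the list items
print(texts, hrefs)
```

Keeping the anchor in one base string makes it easy to pull both the text and the links from the same list.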

Get element name by containing text

I'm looking through HTML documents for the text: "Required". What I need to find is the element that holds the text. For example:
<p>... Required</p>
I would get to element name = p
However, it might not be in a <p> tag. It could be in any kind of tag, which is where this question differs from some of the other search text Stack Overflow questions.
Right now I'm using:
page.at(':contains("Required")')
but this only gets me the full HTML element.
The problem you have is that the :contains pseudo-class matches any element that has the searched-for text anywhere in its descendants. You need to find the innermost element that contains such text. Since html is the ancestor of all elements, if the page contains the text anywhere then html will contain it too, and so html will be the first matching element.
I’m not sure you can achieve this with CSS, but you can use XPath like this:
page.at_xpath('//*[text()[contains(., "Required")]]')
This finds the first element node that has a text() node as a child that contains Required. When you have that node (if it exists) you can then call name on it to give the name of the element.
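The same XPath behaves the same way outside Nokogiri. As an illustration only, here is a sketch in Python with lxml (a substitute for the Ruby setup in the question) on an invented document, where .tag plays the role of name.

```python
from lxml import html

# Hypothetical document; the label is the innermost element holding the text.
doc = html.fromstring(
    "<html><body><div>"
    "<h4>Name</h4>"
    "<label>Email (Required)</label>"
    "</div></body></html>"
)

# Elements that have a direct text node child containing "Required".
hits = doc.xpath('//*[text()[contains(., "Required")]]')

# Take the first match, if any, and read its element name.
name = hits[0].tag if hits else None
print(name)
```

Because the predicate tests direct text node children only, html, body and div do not match even though the text is among their descendants.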
For CSS you can do:
page.at('[text()*="Required"]')
It's not real CSS though, or even a jQuery extra.
You should use CSS selectors:
page.css('p').text

XPath intersection of two sets

I need to extract all links from a html document having text as the inner element and not a reference to an image. Basically I would like to do a doc.select("//a/attribute::href") for all elements in a tree where doc.select("//a/text()") returns anything. Thanks!
Well you can write conditions in XPath in a predicate in square brackets, e.g. //a[text()]/@href selects the href attributes of all link (a) elements that have at least one text node child. Or if you want to make sure there is no img child element in the link you can use e.g. //a[not(img)]/@href.
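The two predicates select different sets when a link holds both an image and text. A short sketch, assuming Python with lxml and made-up links:

```python
from lxml import html

# Hypothetical links: text only, image only, and image plus text.
doc = html.fromstring(
    "<div>"
    "<a href='/text'>Plain text link</a>"
    "<a href='/image'><img src='logo.png'></a>"
    "<a href='/both'><img src='icon.png'> Icon and text</a>"
    "</div>"
)

# Links with at least one text node child.
with_text = doc.xpath("//a[text()]/@href")

# Links with no img child element.
no_img = doc.xpath("//a[not(img)]/@href")

print(with_text, no_img)
```

If only links with text and without any image are wanted, the two conditions can be combined as //a[text() and not(img)]/@href.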
