How to find xpath of an element under a heading - xpath

In a web page:
<h3 class="xh-highlight">Units Currently On Bed List</h3>
"[total beds=0]
"
I want to find the XPath of [total beds=0].
How can I do this?

Your question and your comment are a bit contradictory. Do you want to find the text after a heading or do you want to find the element containing the text [total beds=0]? Also, how exact do you want to navigate your document?
To find the text after any h3 element you can use this: //h3/following-sibling::text()[1] (see XPath - select text after certain node).
To find the text after an h3 element with the class "xh-highlight" you can use this: //h3[@class='xh-highlight']/following-sibling::text()[1]
To be even more precise you can also match the heading text: //h3[@class='xh-highlight' and text()='Units Currently On Bed List']/following-sibling::text()[1]
This doesn't match the html in your first comment however, so you might want to adjust the header class and text values. Also, it will find any first text even if there are other elements between it and the h3 element.
Now, your second comment makes it seem you actually want to find the element containing the text. The reason //*[text()='[total beds=0]'] doesn't work is because of the newline in the text. If you can get rid of that in the source it should match, otherwise you can "ignore" it in the xpath by using //*[normalize-space(text())='[total beds=0]']. (This is assuming the quotes around the text in your question aren't actually in the document.)
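As a rough illustration of the points above, here is a minimal sketch in Python. It assumes the third-party lxml library (not mentioned in the question) and a made-up wrapper div around the heading, since the real page markup isn't shown.

```python
from lxml import etree

# Hypothetical wrapper <div>; the question only shows the h3 and the text after it.
doc = etree.fromstring(
    '<div><h3 class="xh-highlight">Units Currently On Bed List</h3>\n'
    '[total beds=0]\n'
    '</div>'
)

# First text node that follows the heading (newlines included).
after_heading = doc.xpath(
    "//h3[@class='xh-highlight']/following-sibling::text()[1]"
)
print(repr(after_heading[0]))

# An exact comparison fails because the text node carries surrounding newlines.
exact = doc.xpath("//*[text()='[total beds=0]']")
print(exact)

# normalize-space() trims the whitespace, so the containing element matches.
normalized = doc.xpath("//*[normalize-space(text())='[total beds=0]']")
print([el.tag for el in normalized])
```

Note that normalize-space(text()) only looks at the first text node of each element, so it would miss the text if whitespace or another text node came before the heading inside the same parent.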

Related

How xpath works for tags in tags

I am trying to work out the XPath for the first name field on the Facebook page, and I ended up with the following XPath, which is correct: //div[1]/div[1]/div[1]/div[1]/input[@class='inputtext _58mg _5dba _2ph-']. My question is this: there are 9 div tags in total on the page, but the XPath only goes four divs deep. I don't understand how it finds the element through the fourth div.
The page is the Facebook home page and the element to find with XPath is the First name input box.
Please help me understand how the XPath above finds the element.
I know there are other ways to write the XPath, but I want to know why this one works.
I hope I have provided complete information for the question; if not, let me know.
That's because your XPath starts with //. In plain English, it says: find a div, anywhere in the document, whose child is a div whose child is a div whose child is a div whose child is your input. In your case, it does find a chain of divs ending in the input described by your XPath.
If you replace the // with a single /, the path is anchored at the top of the document instead of at any div, so it only looks four levels down from there and won't reach your input since, as you said, there are 9 divs.
Hope that paints a picture. Let me know if you need more explanation.
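To make the difference concrete, here is a small sketch using Python with the lxml library (an assumption, as the question doesn't name a tool). The markup is invented: six nested divs stand in for the real page, which isn't shown.

```python
from lxml import etree

# Made-up page: the input sits six divs deep, not four.
doc = etree.fromstring(
    "<html><body>"
    "<div><div><div><div><div><div>"
    "<input class='inputtext _58mg _5dba _2ph-'/>"
    "</div></div></div></div></div></div>"
    "</body></html>"
)

path = "div[1]/div[1]/div[1]/div[1]/input[@class='inputtext _58mg _5dba _2ph-']"

# "//" lets the four-div chain start at any depth, so the last four divs match.
relative = doc.xpath("//" + path)
print(len(relative))

# "/" anchors the chain at the document root, where it does not fit.
absolute = doc.xpath("/" + path)
print(len(absolute))
```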

Xpath syntax to grab listed elements based on ID above containing word

I want to grab li element text and links from a list. The challenge is, the span sometimes has different class names BUT always has the word 'notable' featured in them, example:
<span class="mw-headline" id="Notable_alumni">Notable alumni</span>
OR
<span class="mw-headline" id="Notable_former_pupils">Notable former pupils</span>
So I need to use "contains" somehow, so I am along these lines:
//li[contains(span/@id,'Notable')]/span/@id/following-sibling::text()
But can't get this right.
Another issue is these blocks of text and headers are not in the same containing div either. Added an image to simplify and you can see the code.
This assumes that the span with the @id is always under the h2 (you could make it more generic by using * instead of h2 if that doesn't hold true). If you anchor to that containing element and then look for the first ul that is a following-sibling, you can select the text() from all of its li elements:
//h2[span[contains(@id,'Notable')]]/following-sibling::ul[1]/li//text()
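A minimal sketch of that approach, assuming Python with lxml (not part of the question) and invented list content modelled on the two span examples. The second expression, which pulls the link targets, is my own addition for the "links" part of the question.

```python
from lxml import etree

# Invented markup: the heading and its list are siblings, not nested.
doc = etree.fromstring(
    "<div>"
    "<h2><span class='mw-headline' id='Notable_alumni'>Notable alumni</span></h2>"
    "<ul><li><a href='/wiki/A'>Alice</a>, author</li><li>Bob</li></ul>"
    "<h2><span class='mw-headline' id='History'>History</span></h2>"
    "<ul><li>Founded</li></ul>"
    "</div>"
)

# Anchor on the h2 whose span id contains 'Notable', then take the next ul.
base = "//h2[span[contains(@id,'Notable')]]/following-sibling::ul[1]/li"

texts = doc.xpath(base + "//text()")    # every text node inside the list items
links = doc.xpath(base + "//a/@href")   # link targets inside the list items

print(texts)
print(links)
```

Note that contains() is case-sensitive, so 'Notable' and 'notable' are different matches.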

xpath - element containing exact text, but minus sibling elements?

Without using index specificity. I'm trying to target an element with exact text, but which also ignores the text of sibling elements. For example, target the span with Save below.
<span>Click and save money!</span>
<span>
<i>Icon</i>
Save
</span>
So something like //span[contains(text(), 'Save')] would grab any span with "Save" in it.
Try this XPath: //span[text()[normalize-space(.)='Save']]
It looks for span elements that have a text node whose whitespace-trimmed value is exactly Save.
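Here is a quick way to check this, sketched in Python with lxml (an assumed tool). The "Save now" span is a hypothetical extra added to show what a looser contains() test would also pick up.

```python
from lxml import etree

doc = etree.fromstring(
    "<div>"
    "<span>Click and save money!</span>"
    "<span>Save now</span>"                  # hypothetical extra span
    "<span>\n<i>Icon</i>\nSave\n</span>"     # the target from the question
    "</div>"
)

# Loose match: any span whose full string value contains 'Save'.
loose = doc.xpath("//span[contains(., 'Save')]")
print(len(loose))

# Exact match: a span with a text node that is 'Save' once whitespace is trimmed.
exact = doc.xpath("//span[text()[normalize-space(.)='Save']]")
print(len(exact))
```

The predicate tests each text node separately, which is why the sibling i element's "Icon" text does not interfere.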

how to select the second <p> element using Xpath

I am trying to scrape full reviews from this webpage (the full reviews that appear after clicking the 'Read More' button), using RSelenium. I am able to select and extract text from the first <p> element with this code:
reviewNodes <- mybrowser$findElements(using = 'xpath', "//p[@id][1]")
which returns the shortened review text.
But I am not able to extract the full review text using
reviewNodes <- mybrowser$findElements(using = 'xpath', "//p[@id][2]")
or
reviewNodes <- mybrowser$findElements(using = 'xpath', "//p[@itemprop = 'reviewBody']")
These show blank list elements. I don't know what is wrong. Please help.
Drop the double slash and try to use the explicit descendant axis:
/descendant::p[@id][2]
(see the note from W3C document on XPath I mentioned in this answer)
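The difference between the two forms can be seen in a small sketch. It uses Python with lxml rather than RSelenium, and the markup is a guess: it assumes each p sits in its own parent, which is one layout that would give an empty result for //p[@id][2].

```python
from lxml import etree

# Assumed layout: every <p> is the only <p> inside its parent.
doc = etree.fromstring(
    "<div>"
    "<div><p id='r1'>short review</p></div>"
    "<div><p id='r2'>full review</p></div>"
    "</div>"
)

# "//p[@id][2]" means: the second p[@id] child of the same parent.
per_parent = doc.xpath("//p[@id][2]")
print(per_parent)

# "/descendant::p[@id][2]" means: the second p[@id] in the whole document.
doc_wide = doc.xpath("/descendant::p[@id][2]")
print([p.get("id") for p in doc_wide])
```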
As you're dealing with a list, you should first find the list items, e.g. using CSS selector
div.srm
Based on these elements, you can then search on inside the list items, e.g. using CSS selector
p[itemprop='reviewBody']
Of course you can also do it in 1 single expression, but that is not quite as neat imho:
div.srm p[itemprop='reviewBody']
Or in XPath (which I wouldn't recommend):
//div[@class='srm']//p[@itemprop='reviewBody']
If neither of these work for you, then the problem must be somewhere else.
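The two-step search can be sketched like this in Python with lxml (an assumption; the question uses RSelenium). The markup is invented around the div.srm and itemprop names from the answer.

```python
from lxml import etree

# Invented markup for two list items.
doc = etree.fromstring(
    "<div>"
    "<div class='srm'><p id='a'>short 1</p><p itemprop='reviewBody'>full 1</p></div>"
    "<div class='srm'><p id='b'>short 2</p><p itemprop='reviewBody'>full 2</p></div>"
    "</div>"
)

# Step 1: find the list items. Step 2: search inside each one (note the leading dot).
items = doc.xpath("//div[@class='srm']")
two_step = [item.xpath("string(.//p[@itemprop='reviewBody'])") for item in items]

# The same thing as a single expression.
one_step = [p.text for p in doc.xpath("//div[@class='srm']//p[@itemprop='reviewBody']")]

print(two_step)
print(one_step)
```

The two-step form keeps each review tied to its list item, which helps if an item turns out to have no review body.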

Get element name by containing text

I'm looking through HTML documents for the text: "Required". What I need to find is the element that holds the text. For example:
<p>... Required</p>
I would want to get element name = p
However, it might not be in a <p> tag. It could be in any kind of tag, which is where this question differs from some of the other search-text Stack Overflow questions.
Right now I'm using:
page.at(':contains("Required")')
but this only gets me the full HTML element.
The problem you have is that the :contains pseudo-class matches any element that has the searched-for text anywhere in its descendants. You need to find the innermost element that contains the text. Since html is the ancestor of all elements, if the page contains the text anywhere then html will contain it, and so html will be the first matching element.
I’m not sure you can achieve this with CSS, but you can use XPath like this:
page.at_xpath('//*[text()[contains(., "Required")]]')
This finds the first element node that has a text() node as a child that contains Required. When you have that node (if it exists) you can then call name on it to give the name of the element.
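The same XPath can be tried outside Nokogiri. Below is a sketch in Python with lxml (an assumed substitute for the Ruby code above) on an invented page, comparing the "anywhere in descendants" match with the "own text node" match.

```python
from lxml import etree

# Invented page with the text nested a few levels down.
doc = etree.fromstring(
    "<html><body><div><p>... Required</p></div></body></html>"
)

# Matches every ancestor too, because each one contains the text somewhere below it.
loose = [el.tag for el in doc.xpath("//*[contains(., 'Required')]")]
print(loose)

# Matches only elements that have the text in one of their own text nodes.
inner = doc.xpath("//*[text()[contains(., 'Required')]]")
print(inner[0].tag)
```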
For CSS you can do:
page.at('[text()*="Required"]')
It's not real CSS though, or even a jQuery extra.
You should use CSS selectors:
page.css('p').text
