XPath expression to select text WITH link - xpath

I want an XPath expression that returns text together with the link embedded in it. Can I return both at the same time, or am I forced to return them separately:
the link, and
the text label of the link?

You can do something like the following (a function call as the last path step requires XPath 2.0 or later):
//actor[@color='blue']/concat(@id,",",@class)
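Where only XPath 1.0 is available, one alternative is to select the link element itself and read both values in the host language. This is a minimal sketch using Python's standard xml.etree.ElementTree; the sample markup and the helper name links_with_text are made up for illustration and are not part of the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; the real page structure is not shown in the question.
SAMPLE = """
<div>
  <p>See the <a href="https://example.com/docs">documentation</a> for details.</p>
  <p>No link here.</p>
</div>
"""

def links_with_text(xml_source):
    """Return (href, link text) pairs for every <a> element that has an href."""
    root = ET.fromstring(xml_source)
    pairs = []
    for a in root.findall(".//a[@href]"):
        # itertext() also collects text from nested elements inside the link.
        pairs.append((a.get("href"), "".join(a.itertext()).strip()))
    return pairs

print(links_with_text(SAMPLE))
```

Selecting the element once and reading its attribute and text separately avoids having to encode both values into one string.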

Related

How to get only div text not span text using xpath

I am trying to get only the order number, such as "190795" (see the image).
What I have tried:
self.driver.find_element_by_xpath('//div[@class="sc-kafWEX ihwrOP"]//div/div/div[3]').text
But it also returns the span text, for example "Order number:190795".
I want only "190795".
This is my HTML code:
The order number is a text node. Selenium's find_element methods can only return elements, not text nodes, and Selenium uses XPath 1.0, so you will have to extract the number in the binding language.
Try this:
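# Read the full text of the element, e.g. "Order number:190795"
org_text = self.driver.find_element_by_xpath('//div[@class="sc-kafWEX ihwrOP"]//div/div/div[3]').text
# Keep only the part after the colon and drop surrounding whitespace
desired_text = org_text.split(':')[1].strip()
print(desired_text)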
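The split step can be made a little more defensive. The sketch below assumes the label always has the form "Order number:<value>"; the function name extract_order_number is hypothetical.

```python
def extract_order_number(label_text):
    """Return the part after the first colon, stripped of whitespace."""
    # partition() never raises; sep is "" when no colon is present.
    _, sep, value = label_text.partition(":")
    if not sep:
        raise ValueError(f"no ':' found in {label_text!r}")
    return value.strip()

print(extract_order_number("Order number:190795"))
```

Unlike split(':')[1], this reports a clear error when the colon is missing, and it keeps everything after the first colon if the value itself contains one.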

how to select the second <p> element using Xpath

I am trying to scrape full reviews from this webpage (the full reviews shown after clicking the 'Read More' button), using RSelenium. I am able to select and extract text from the first <p> element using the code
reviewNodes <- mybrowser$findElements(using = 'xpath', "//p[@id][1]")
which gives the shortened review text.
But I am not able to extract the full review text using the code
reviewNodes <- mybrowser$findElements(using = 'xpath', "//p[@id][2]")
or
reviewNodes <- mybrowser$findElements(using = 'xpath', "//p[@itemprop = 'reviewBody']")
It returns blank list elements. I don't know what is wrong. Please help.
Drop the double slash and use the explicit descendant axis:
/descendant::p[@id][2]
(see the note from W3C document on XPath I mentioned in this answer)
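The difference between the two forms can be illustrated with Python's standard xml.etree.ElementTree. It supports only a subset of XPath and has no descendant:: axis, so the second case is emulated with iter(). The markup is a made-up stand-in for the review page.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: each review sits in its own container.
SAMPLE = """
<div>
  <div class="srm"><p id="a">first review</p></div>
  <div class="srm"><p id="b">second review</p></div>
</div>
"""

root = ET.fromstring(SAMPLE)

# Like //p[2]: <p> elements that are the second <p> child of their own parent.
second_within_parent = root.findall(".//p[2]")

# Like /descendant::p[2]: the second <p> in document order.
second_in_document = list(root.iter("p"))[1]

print(len(second_within_parent))
print(second_in_document.get("id"))
```

Because every <p> here is the only <p> inside its parent, the positional predicate after the double slash matches nothing, while the document-order lookup finds the second paragraph.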
As you're dealing with a list, you should first find the list items, e.g. using CSS selector
div.srm
Based on these elements, you can then search inside the list items, e.g. using CSS selector
p[itemprop='reviewBody']
Of course you can also do it in 1 single expression, but that is not quite as neat imho:
div.srm p[itemprop='reviewBody']
Or in XPath (which I wouldn't recommend):
//div[@class='srm']//p[@itemprop='reviewBody']
If neither of these works for you, then the problem must be somewhere else.
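The two-step approach (find the list items first, then search inside each one) might look like this in Python with the standard xml.etree.ElementTree. The markup and the function name full_reviews are assumptions for illustration; only the srm class and the reviewBody itemprop come from the answer.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup modelled on the selectors in the answer.
SAMPLE = """
<div>
  <div class="srm">
    <p id="r1">Short text...</p>
    <p itemprop="reviewBody">Full text of the first review.</p>
  </div>
  <div class="srm">
    <p id="r2">Short text...</p>
    <p itemprop="reviewBody">Full text of the second review.</p>
  </div>
</div>
"""

def full_reviews(xml_source):
    """Return the full review text of every list item that has one."""
    root = ET.fromstring(xml_source)
    reviews = []
    # Step 1: find the list items.
    for item in root.findall(".//div[@class='srm']"):
        # Step 2: search inside this item only.
        body = item.find(".//p[@itemprop='reviewBody']")
        if body is not None:
            reviews.append("".join(body.itertext()).strip())
    return reviews

print(full_reviews(SAMPLE))
```

Searching per item keeps each review tied to its container, which makes it easy to skip items that have no full text.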

xpath - how to show the anchor text of a specified link

I would like to search for a link on a page by its domain name - possibly using contains()? And then only show the anchor text of that link.
I've been able to get all of the a tags using
//a[contains(text(), 'domain_name')]
but unable to retrieve just the anchor text. Can anybody help?
Match on the href attribute and select the text() node:
//a[contains(@href, 'domain_name')]/text()
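The same filter can be sketched in Python. The standard xml.etree.ElementTree has no contains() function, so the substring test is done in Python instead; the markup and the name anchor_texts are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup with two links to different domains.
SAMPLE = """
<div>
  <a href="https://example.com/page">Example page</a>
  <a href="https://other.test/">Other site</a>
</div>
"""

def anchor_texts(xml_source, domain):
    """Return the anchor text of every link whose href contains `domain`."""
    root = ET.fromstring(xml_source)
    return [
        "".join(a.itertext()).strip()
        for a in root.iter("a")
        if domain in a.get("href", "")  # mirrors contains(@href, domain)
    ]

print(anchor_texts(SAMPLE, "example.com"))
```

A plain substring test also matches the domain name anywhere in the URL, including the path, so a stricter check may be needed on real pages.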

how to find link using find.byselector using text property of link

We have a link with the text "linktext".
How can we identify it using Find.BySelector and the text of the link, as we do in Selenium?
//a[contains(text(), 'linktext')]
To find a link by exact text:
myIE.Link(Find.ByText("my literal link text")).Click();
To find a link whose text contains a given string, one way is to do the same as above but pass a regular expression: Find.ByText(new Regex......());

XPath Expression

I am new to XPath. I have the HTML source of the webpage
http://london.craigslist.co.uk/com/1233708939.html
Now I want to extract the following data from the above page
Full Date
Email - just below the date
I also want to find the existence of the button "Reply to this post" on the page
http://sfbay.craigslist.org/sfc/w4w/1391399758.html
Can anyone help me write the three XPath expressions for these three items?
You don't need to write these yourself, or even figure them out yourself. If you use the Firebug plugin, go to the page, right-click on the element you want, and click 'Inspect element'; Firebug will pop up the HTML in a viewer at the bottom of your browser. Right-click on the desired element in the HTML viewer and click 'Copy XPath'.
That said, the XPath expression you're looking for (for #3) is:
/html/body/div[4]/form/button
...obtained via the method described above.
I noticed that the DTD for the first link is HTML 4.01 Transitional and not XHTML, so there's no guarantee that this is a valid XML document, and it may not be loaded correctly by an XML parser. In fact, I see several tags that aren't properly closed (e.g. <hr>).
I don't know the first one offhand, and the third one was just answered by Alex, but the second one is /html/body/a[1] (XPath positions start at 1, not 0).
As for your first page, this is not possible with XPath alone, because that is not how XPath works: for an expression to select something, that "something" must be a node (for example, an element).
The second page is fairly easy, but you need an "id" attribute (or anything else that makes sure your button is unique). For example, if you are sure the text "Reply to this post" correctly identifies the button, test for that text in a predicate:
//button[contains(., 'Reply to this post')]
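A text-based existence check can be sketched with Python's standard xml.etree.ElementTree, which supports the [.='text'] predicate from Python 3.7 on. The markup and the helper name has_button are made up for illustration, and the label is assumed not to contain a single quote.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; the real Craigslist page is not reproduced here.
SAMPLE = """
<body>
  <form><button>Reply to this post</button></form>
  <form><button>Flag</button></form>
</body>
"""

def has_button(xml_source, label):
    """Return True if a <button> whose complete text equals `label` exists."""
    root = ET.fromstring(xml_source)
    # [.='text'] compares the element's complete text content (Python 3.7+).
    # Assumption: `label` contains no single quote, which would break the path.
    return len(root.findall(f".//button[.='{label}']")) > 0

print(has_button(SAMPLE, "Reply to this post"))
```

This only works on well-formed markup; a page that is not valid XML would first need an HTML-tolerant parser.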

Resources