I've tried everything below, not working for me. I am trying to avoid using "contains"
//p[text()[contains(.,'text1')]][text()[contains(.,'text2')]]
Here is my html p element
<p>
text1
<br></br>
text2
</p>
Here's what I've tried so far:
"//p[normalize-space(text()) = 'text1text2']]"
"//p[normalize-space() = 'text1text2']]")
"//p[text()[normalize-space() ='text1text2']]"
"//p[text()[normalize-space().,'text1text2']]"
@"//p[text() = ""text1\r\ntext2""]"
I had to do this:
"//p[.='text1text2']"
But I would still like to see if I could verify the newline in my xpath somehow.
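One way to see what the XPath engine is comparing against is to dump the text nodes of the p element. This is only a sketch in Python with the standard library. ElementTree's XPath subset has no text() test, so it reads the text nodes through itertext() instead, and the markup string is assumed to match the snippet in the question, line breaks included.

```python
import xml.etree.ElementTree as ET

# Assumed copy of the <p> from the question, parsed as XML.
p = ET.fromstring("<p>\ntext1\n<br></br>\ntext2\n</p>")

# Each text node exactly as stored, line breaks included.
raw_nodes = list(p.itertext())

# Per-node equivalent of normalize-space(): collapse whitespace runs and trim.
clean_nodes = [" ".join(t.split()) for t in raw_nodes]

# Equivalent of normalize-space() applied to the whole element.
whole = " ".join("".join(raw_nodes).split())

print(raw_nodes)    # two separate text nodes, one on each side of <br>
print(clean_nodes)
print(whole)
```

If the real page has line breaks like this, normalize-space() on the whole element yields 'text1 text2' with a space, so a comparison against 'text1text2' would not match. Comparing each text() node separately avoids that.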
Watir
mytext = browser.element(:xpath => '//*[@id="gold"]/div[1]/h1').text
Html
<h1>
This is the text I want
<span> I do not want this text </span>
</h1>
When I run my Watir code, it selects all the text, including what is in the spans. How do I just get the text "This is the text I want", and no span text?
If you have more complicated HTML, I find it can be easier to deal with this using Nokogiri, as it provides more methods for parsing the HTML:
require 'nokogiri'
h1 = browser.element(:xpath => '//*[@id="gold"]/div[1]/h1')
doc = Nokogiri::HTML.fragment(h1.html)
mytext = doc.at('h1').children.select(&:text?).map(&:text).join.strip
Ideally start by trying to avoid using XPath. One of the most powerful features of Watir is the ability to create complicated locators without XPath syntax.
The issue is that calling text on a node gets all content within that node. You'd need to do something like:
top_level = browser.element(id: 'gold')
h1_text = top_level.h1.text
span_text = top_level.h1.span.text
desired_text = h1_text.chomp(span_text)
This is useful for top level text.
If there is only one h1, you can omit the id (String#remove comes from ActiveSupport)
@b.h1.text.remove(@b.h1.children.collect(&:text).join(' '))
Or specify it if there are more
@b.h1(id: 'gold').text.remove(@b.h1(id: 'gold').children.collect(&:text).join(' '))
Make it a method and call it from your script with get_top_text(@b.h1) to get it
def get_top_text(el)
  # Strip the combined text of the child elements from the end of the element's own text
  el.text.chomp(el.children.collect(&:text).join(' '))
end
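The same idea can be tried outside the browser. Below is a rough sketch in Python using only the standard library (not Watir). It assumes the h1 markup from the question and collects only the text that sits directly inside the element, skipping whatever is inside child elements. own_text is a made-up helper name.

```python
import xml.etree.ElementTree as ET

def own_text(el):
    """Return only the text directly inside el, skipping child elements."""
    # el.text is the text before the first child; each child's .tail is
    # the text that follows that child inside el.
    parts = [el.text or ""]
    parts.extend(child.tail or "" for child in el)
    # Collapse whitespace runs and trim the ends.
    return " ".join("".join(parts).split())

# Assumed copy of the markup from the question.
h1 = ET.fromstring(
    "<h1>\nThis is the text I want\n"
    "<span> I do not want this text </span>\n</h1>"
)
print(own_text(h1))
```

Unlike the chomp approach, this does not depend on the unwanted text being at the end of the element.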
Here is some sample HTML
<div class="something">
<p> This is a <b> Paragraph </b> with mixed elements
<p> Next paragraph....
</div>
what I tried was
//div[contains('@class','something')/text()
and
//div[contains('@class','something')/*/text()
and
//div[contains('@class','something')/p/text()
all of these seem to skip the 'b' tags and the 'a' tags.
Try " ".join(sel.xpath("//div[contains(@class,'something')]//text()").extract()), where sel is your selector (in your case it may be response).
Use the XPath expression
//div[contains(@class,'something')]//text()
to get a concatenation of the text of all the text() nodes in the chosen div element.
Output:
This is a Paragraph with mixed elements
Next paragraph....
It depends on what you want to obtain and how. Anyway, there are a couple of problems with what you tried:
You are missing the closing bracket (]) after contains in the XPath expression.
@class should not be enclosed in (single) quotes when used inside contains.
If you want to get all the text of the div element as one string, you might use
normalize-space(//div[contains(@class,'something')])
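To check what that normalize-space() call would give, here is a small sketch in Python with the standard library only. Two assumptions: the sample HTML has been closed up into well-formed XML (the p tags are closed and a body wrapper is added), and the class is matched exactly rather than with contains(), because ElementTree's XPath subset does not support contains().

```python
import xml.etree.ElementTree as ET

# Well-formed version of the sample HTML (closing tags added as an assumption).
html = (
    "<body><div class='something'>"
    "<p> This is a <b> Paragraph </b> with mixed elements</p>"
    "<p> Next paragraph....</p>"
    "</div></body>"
)
root = ET.fromstring(html)

# Exact attribute match; ElementTree has no contains().
div = root.find(".//div[@class='something']")

# itertext() walks every descendant text node, including the one inside <b>.
# Joining and re-splitting mimics normalize-space() on the whole div.
all_text = " ".join("".join(div.itertext()).split())
print(all_text)
```

The text inside the b element is kept because all descendant text nodes are visited, not only the direct children of div or p.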
My code is like this,
<div>
<strong> Text1: </strong>
1234
<br>
<strong> Text2: </strong>
5678
<br>
</div>
where the numbers 1234 and 5678 are generated dynamically. When I take the XPath of Text2 : 5678, it gives me something like /html/body/div[7]/div/div[2]/div/div[2]/div[2]/br[2]. This does not work for me. I need the XPath of only "Text2 : 5678". Any help will be appreciated. (I am using Selenium WebDriver and C# to code my test script.)
I second @Anil's comment above. The text "Text2:" is retrievable as it is within the "strong" element. But "5678" comes under the div and is not the innerHTML of either "strong" or "br".
Hence, to retrieve the text "Text 2: 5678", you'll have to retrieve the innerHTML/text of "div" and modify it accordingly to get the required text.
Below is a Java code snippet to retrieve the text:
WebElement ele = driver.findElement(By.xpath("//div"));
System.out.print(ele.getText().split("\n")[1]); // Split on newlines; index 1 is the second line, "Text2: 5678"
I hope you can formulate the above in C#.
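If splitting the whole div text feels fragile, another option is to pair each label with the loose text that follows it. The sketch below is Python with the standard library, not Selenium. It assumes the div markup from the question, with the br tags self-closed so it parses as XML.

```python
import xml.etree.ElementTree as ET

# Assumed copy of the markup from the question, made well-formed.
div = ET.fromstring(
    "<div>"
    "<strong> Text1: </strong>\n1234\n<br/>"
    "<strong> Text2: </strong>\n5678\n<br/>"
    "</div>"
)

# For every <strong>, .text is the label and .tail is the loose text
# that sits between this element and the next tag.
values = {
    strong.text.strip(): (strong.tail or "").strip()
    for strong in div.iter("strong")
}
print(values)
print("Text2: " + values["Text2:"])
```

This keeps working if more label/value pairs are added or the order changes, since the lookup is by label rather than by line position.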
How would one, via xpath, select the strong tag after baz text for example?
<p>
<br>foo<strong>this foo</strong>
<br>bar<strong>this bar</strong>
<br>baz<strong>this baz</strong>
<br>qux<strong>this qux</strong></p>
Obviously the following does not work....
//p[text() = 'baz']/following-sibling::select[1]
Try this
//p/text()[. = 'baz']/following-sibling::strong[1]
Demo here - http://www.xpathtester.com/obj/b67bad4d-4d38-4e2d-a3df-b7e5a2e9f286
This solution relies on there being no whitespace around your text nodes. If you start using indentation or other whitespace characters, you will need to switch to the following:
//p/text()[normalize-space(.) = 'baz']/following-sibling::strong[1]
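For comparison, the same lookup can be sketched without XPath text() support. This is Python with the standard library only. strong_after is a made-up helper, and the markup is the question's snippet with the br tags self-closed so that it parses as XML.

```python
import xml.etree.ElementTree as ET

# Assumed copy of the markup from the question, made well-formed.
p = ET.fromstring(
    "<p>\n"
    "<br/>foo<strong>this foo</strong>\n"
    "<br/>bar<strong>this bar</strong>\n"
    "<br/>baz<strong>this baz</strong>\n"
    "<br/>qux<strong>this qux</strong></p>"
)

def strong_after(parent, label):
    """Return the first <strong> sibling after the text node equal to label."""
    children = list(parent)
    for i, child in enumerate(children):
        # The text node following an element is stored as that element's .tail;
        # strip() plays the role of normalize-space() here.
        if (child.tail or "").strip() == label:
            for sibling in children[i + 1:]:
                if sibling.tag == "strong":
                    return sibling
    return None

match = strong_after(p, "baz")
print(match.text)
```

The strip() call is what makes this tolerant of indentation, in the same way normalize-space(.) does in the XPath version.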
For example:
<p>
<b>Member Since:</b> Aug. 07, 2010<br><b>Time Played:</b> <span class="text_tooltip" title="Actual Time: 15.09:37:06">16 days</span><br><b>Last Game:</b>
<span class="text_tooltip" title="07/16/2011 23:41">1 minute ago</span>
<br><b>Wins:</b> 1,017<br><b>Losses / Quits:</b> 883 / 247<br><b>Frags / Deaths:</b> 26,955 / 42,553<br><b>Hits / Shots:</b> 690,695 / 4,229,566<br><b>Accuracy:</b> 16%<br>
</p>
I want to get 1,017. It is the text right after the b tag that contains Wins:.
If I used regex, it would be [/<b>Wins:<\/b> ([^<]+)/,1], but how to do it with Nokogiri and XPath?
Or should I better parse this part of page with regex?
Here
doc = Nokogiri::HTML(html)
puts doc.at('b[text()="Wins:"]').next.text
You can use this XPath: //*[*/text() = 'Wins:']/text() It will return 1,017.
About regex: RegEx match open tags except XHTML self-contained tags
I would use pure XPath like:
"//b[.='Wins:']/following::node()[1]"
I've heard a thousand times (and from gurus) "never use regex to parse XML". Can you provide some "shocking" reference demonstrating that this sentence is no longer valid?
Use:
//*[. = 'Wins:']/following-sibling::node()[1]
In case this is ambiguous (selects more than one node), more strict expressions can be specified:
//*[. = 'Wins:']/following-sibling::node()[self::text()][1]
Or:
(//*[. = 'Wins:'])[1]/following-sibling::node()[1]
Or:
(//*[. = 'Wins:'])[1]/following-sibling::node()[self::text()][1]