How do I write an xpath expression for an element following another with certain text? - xpath

How do I construct an XPath expression that selects the <p> element following a <p> element that has a child <strong> element with the text "About this event:"?
In other words, what path expression will give me the <p> element with the "Hello" text following the <p> with the <strong> text below …
<p>
<strong>About this event:</strong>
…
</p>
<p>Hello</p>
?

//p[strong[.='About this event:']]/following-sibling::p
or
//p[strong[.='About this event:']]/following-sibling::p[.='Hello']

Try this:
//p[normalize-space(strong)='About this event:']/following-sibling::p
You can also narrow this down to the first p by adding [1]:
//p[normalize-space(strong)='About this event:']/following-sibling::p[1]

I believe this should be do-able. Here's the output from Perl's xpath tool:
so zacharyyoung$ xpath so.xml '//p[strong = "About this event:"]/following-sibling::p[1]'
Found 1 nodes:
-- NODE --
<p>Hello</p>
The XPath is //p[strong = "About this event:"]/following-sibling::p[1]
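To see the logic behind these expressions, here is a minimal Python sketch using only the standard library. It is not a real XPath evaluation: xml.etree.ElementTree has no following-sibling axis and no normalize-space(), so the sketch walks the siblings by hand. The <body> wrapper, the extra third paragraph, and the helper names normalize_space and first_following_p are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper document around the snippet from the question.
DOC = """<body>
<p>
<strong>About this event:</strong>
...
</p>
<p>Hello</p>
<p>Later paragraph</p>
</body>"""


def normalize_space(element):
    """Rough stand-in for XPath normalize-space() on an element's string value."""
    return " ".join("".join(element.itertext()).split())


def first_following_p(root, label):
    """Emulate //p[normalize-space(strong)=label]/following-sibling::p[1].

    Unlike the XPath, this returns only the first match in the document.
    """
    for parent in root.iter():
        children = list(parent)
        for index, child in enumerate(children):
            strong = child.find("strong")
            if child.tag != "p" or strong is None:
                continue
            if normalize_space(strong) != label:
                continue
            # following-sibling::p[1]: the nearest later sibling named p
            for sibling in children[index + 1:]:
                if sibling.tag == "p":
                    return sibling
    return None


match = first_following_p(ET.fromstring(DOC), "About this event:")
print(match.text if match is not None else None)
```

With an engine that supports the full XPath 1.0 axes, the one-line expressions above do the same job directly.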

Related

xPath - Why is this exact text selector not working with the data test id?

I have a block of code like so:
<ul class="open-menu">
<span>
<li data-testid="menu-item" class="menu-item option">
<svg>...</svg>
<div>
<strong>Text Here</strong>
<small>...</small>
</div>
</li>
<li data-testid="menu-item" class="menu-item option">
<svg>...</svg>
<div>
<strong>Text</strong>
<small>...</small>
</div>
</li>
</span>
</ul>
I'm trying to select a menu item based on exact text like so in the dev tools:
$x('.//*[contains(@data-testid, "menu-item") and normalize-space() = "Text"]');
But this doesn't seem to be selecting the element. However, when I do:
$x('.//*[contains(@data-testid, "menu-item")]');
I can see both of the menu items.
UPDATE:
It seems that this works:
$x('.//*[contains(@class, "menu-item") and normalize-space() = "Text"]');
I'm not sure why using the class works in this context and the data-testid does not. How can I get my XPath selector to work with my data-testid?
Why is this exact text selector not working
The fact that both li elements are matched by the XPath expression
when the condition normalize-space() = "Text" is omitted is a clue.
normalize-space() returns ... Text Here ... for the first li
in the posted XML and ... Text ... for the second (or some other
content in place of ... from svg or div/small), causing
normalize-space() = "Text" to fail.
In an update you say the same condition succeeds. This has nothing to
do with using @class instead of @data-testid; it must be triggered
by some content change.
How can I get my xpath selector to work with my data-testid?
By testing for an exact text match in the li's descendant strong
element,
.//*[@data-testid = "menu-item" and div/strong = "Text"]
which matches the second li. Making the test more robust is usually
in order, e.g.
.//*[contains(@data-testid,"menu-item") and normalize-space(div/strong) = "Text"]
Append /div/small or /descendant::small, for example, to the XPath
expression to extract just the small text.
data-testid="menu-item" matches both of the outer li elements, while the text you are looking for is inside the inner strong element.
So, to locate the outer li element based on its data-testid attribute value and its inner strong element's text value, you can use an XPath expression like this:
//*[contains(@data-testid, "menu-item") and .//normalize-space() = "Text"]
Or
.//*[contains(@data-testid, "menu-item") and .//*[normalize-space() = "Text"]]
I have tested both expressions, and they work correctly.
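The difference between testing the whole li and testing only its strong descendant can be checked with a small Python sketch. It uses the standard library's xml.etree.ElementTree, whose XPath subset has no normalize-space(), so a hypothetical normalize_space helper stands in for it. The "icon" and "hint" strings are placeholders for the content elided as ... in the question.

```python
import xml.etree.ElementTree as ET

# Markup from the question; "icon" and "hint" are assumed placeholder content.
MENU = """<ul class="open-menu">
<span>
<li data-testid="menu-item" class="menu-item option">
<svg>icon</svg>
<div><strong>Text Here</strong><small>hint</small></div>
</li>
<li data-testid="menu-item" class="menu-item option">
<svg>icon</svg>
<div><strong>Text</strong><small>hint</small></div>
</li>
</span>
</ul>"""


def normalize_space(element):
    """Rough stand-in for XPath normalize-space() on an element's string value."""
    return " ".join("".join(element.itertext()).split())


root = ET.fromstring(MENU)
items = root.findall(".//li[@data-testid='menu-item']")

# String value of each whole li: it includes the svg and small text as well.
whole_values = [normalize_space(li) for li in items]

# Testing only div/strong, as in normalize-space(div/strong) = "Text".
matched = [li for li in items if normalize_space(li.find("div/strong")) == "Text"]

print(whole_values)
print(len(matched))
```

Neither whole-li value equals "Text" as long as the sibling elements contain any text, which is why the predicate has to target the strong element.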

XPath: how to find a node that does not contain text?

I have HTML like this:
...
<div class="grid">
"abc"
<span class="searchMatch">def</span>
</div>
<div class="grid">
<span class="searchMatch">def</span>
</div>
...
I want to get the div which does not contain text, but the XPath
//div[@class='grid' and text()='']
doesn't seem to work. If I don't know the text that the other divs have, how can I find the node?
Let's suppose I have inferred the requirement correctly as:
Find all <div> elements with #class='grid' that have no directly-contained non-whitespace text content, i.e. no non-whitespace text content unless it's within a child element like a <span>.
Then the answer to this is
//div[@class='grid' and not(text()[normalize-space(.)])]
You need not() combined with normalize-space():
//div[@class='grid' and not(normalize-space(text()))]
or
//div[@class='grid' and normalize-space(text())='']
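The "no directly-contained non-whitespace text" test can be sketched in Python with the standard library. ElementTree does not expose text nodes to its XPath subset, so the hypothetical has_direct_text helper below inspects element.text and each child's tail instead, which together are the direct text children of an element. The <body> wrapper is an assumption.

```python
import xml.etree.ElementTree as ET

# Sample from the question inside an assumed <body> wrapper.
PAGE = """<body>
<div class="grid">
"abc"
<span class="searchMatch">def</span>
</div>
<div class="grid">
<span class="searchMatch">def</span>
</div>
</body>"""


def has_direct_text(element):
    """True if any text node directly under element has non-whitespace content."""
    direct_chunks = [element.text] + [child.tail for child in element]
    return any(chunk and chunk.strip() for chunk in direct_chunks)


root = ET.fromstring(PAGE)
grids = root.findall(".//div[@class='grid']")

# Mirrors not(text()[normalize-space(.)]): keep divs with no direct text.
empty_grids = [div for div in grids if not has_direct_text(div)]
print(len(grids), len(empty_grids))
```

One thing to keep in mind: in XPath 1.0, normalize-space(text()) only looks at the first text node, so the two shorter forms can give a different result from not(text()[normalize-space(.)]) when a div has several text children.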

Xpath getting text with mixed elements in same div

Here is some sample HTML
<div class="something">
<p> This is a <b> Paragraph </b> with mixed elements
<p> Next paragraph....
</div>
What I tried was
//div[contains('@class','something')/text()
and
//div[contains('@class','something')/*/text()
and
//div[contains('@class','something')/p/text()
All of these seem to skip the 'b' tags and the 'a' tags.
Try " ".join(sel.xpath("//div[contains(@class,'something')]//text()").extract()), where sel is your selector; in your case it may be response.
Use the XPath expression
//div[contains(@class,'something')]//text()
to get all the text() nodes inside the chosen div element, which can then be concatenated.
Output:
This is a Paragraph with mixed elements
Next paragraph....
It depends on what you want to obtain and how. In any case, there are a couple of problems with what you tried:
The closing bracket (]) is missing after contains(...) in each XPath expression.
@class should not be enclosed in (single) quotes when used inside contains.
If you want to get all the text of the div element as one string, you might use
normalize-space(//div[contains(@class,'something')])
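Here is a small Python sketch of why p/text() skips the text inside b while //text() does not. It uses the standard library's ElementTree rather than a real XPath engine: itertext() plays the role of .//text(), and text plus child tails play the role of text(). The closing </p> tags are added as an assumption so the sample parses as XML.

```python
import xml.etree.ElementTree as ET

# Well-formed variant of the sample (closing </p> tags added as an assumption).
SAMPLE = """<div class="something">
<p> This is a <b> Paragraph </b> with mixed elements</p>
<p> Next paragraph....</p>
</div>"""

div = ET.fromstring(SAMPLE)
first_p = div.find("p")

# Counterpart of p/text(): only text nodes that are direct children of <p>.
direct_text = [first_p.text] + [child.tail for child in first_p]

# Counterpart of .//text(): every descendant text node, in document order.
all_text = list(div.itertext())

# Counterpart of normalize-space(div): one string with whitespace collapsed.
flattened = " ".join("".join(all_text).split())

print(direct_text)
print(flattened)
```

The direct text of the first p leaves out " Paragraph " because that text node belongs to b, not to p.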

How to find xpath of a text element without node

<h1>
<span class="visually-hidden">BBC Radio</span>
Search results for 'archers'
</h1>
I want to locate the text element "Search results for 'archers'". What XPath will locate it, and not the element in the span node?
For your input sample
/h1/text()
Tested on http://videlibri.sourceforge.net/cgi-bin/xidelcgi
It returns: Search results for 'archers'
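For comparison, a minimal Python sketch with the standard library shows the same idea. In ElementTree the text that follows the span is stored as the span's tail, so collecting h1.text and the child tails gives the direct text nodes that /h1/text() selects. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

H1 = """<h1>
<span class="visually-hidden">BBC Radio</span>
Search results for 'archers'
</h1>"""

h1 = ET.fromstring(H1)

# Direct text nodes of <h1>: its own leading text plus the tail of each child.
direct_text = [h1.text] + [child.tail for child in h1]

# Drop whitespace-only nodes and trim the rest.
visible = [chunk.strip() for chunk in direct_text if chunk and chunk.strip()]
print(visible)
```

Note that /h1/text() also selects the whitespace-only text node before the span; if that matters in your tool, /h1/text()[normalize-space()] keeps only the non-blank one.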

Xpath - matching based on node() contains() content

I have the following HTML structure (there are many blocks using the same architecture):
<span id="mySpan">
<i>
Price
<b>
3 900
<small>€</small>
</b>
</i>
</span>
Now, I want to get the content of <b> using Xpath which I tried like so:
//span[@id="mySpan"]/i/node()[1][contains(text(),"Price")]
which does not match anything. How can I match this using the node()[1] text as an anchor?
Regarding the XPath you tried: node()[1] is itself a text node, and text() selects text node children, which a text node does not have. Simply use . instead:
//span[@id="mySpan"]/i/node()[1][contains(.,"Price")]
For the ultimate goal, I'd suggest this XPath:
//span[@id="mySpan"]/i[contains(.,"Price")]/b
or, if you specifically want to match against the first node within <i>:
//span[@id="mySpan"]/i[contains(node(),"Price")]/b
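The last expression can be mirrored in Python with the standard library, as a rough sketch. In ElementTree the first node inside <i>, when it is a text node, is available as i.text, so the check below anchors on that text and then reads <b>. The variable names are illustrative, and the whitespace collapsing is an added convenience rather than part of the XPath.

```python
import xml.etree.ElementTree as ET

BLOCK = """<span id="mySpan">
<i>
Price
<b>
3 900
<small>€</small>
</b>
</i>
</span>"""

span = ET.fromstring(BLOCK)
i_elem = span.find("i")

# node()[1] of <i> is its leading text node, exposed here as i_elem.text.
first_node_text = i_elem.text or ""

price = None
if "Price" in first_node_text:
    b_elem = i_elem.find("b")
    # String value of <b> (including <small>), with whitespace collapsed.
    price = " ".join("".join(b_elem.itertext()).split())

print(price)
```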
