XPath selector for tag with specific descendant tags selects other tags - xpath

Given a document:
<html>
<body>
<div>
<div>No span</div>
<span>Target</span>
</div>
</body>
</html>
I would like to select the <div> containing the <span>. However, when I use this selector:
//div[//span]
It matches both <div>s:
<div><div>No span</div><span>Target</span></div> <-- what I wanted
<div>No span</div> <-- this is also matched
I tested this on Google Chrome's Devtools, as well as several online XPath evaluators, so I assume this is the correct behavior.
Why is this happening, and how can I fix my selector?

select the <div> containing the <span>
Use relative paths.
//div[.//span]
// starts from the document root. .// starts from the context element.
Predicates evaluate to true when the contained expression selects nodes. This means that //div[//span] is always true when there is a <span> anywhere in the document, in which case all <div>s in the document will be selected. //div[.//span] is only true when there is a <span> anywhere in the respective <div>.
If you mean "has a <span> child" (as opposed to "has a <span> descendant") this will work:
//div[span]
which is a shorthand for this (to underline the difference between / and //):
//div[./span]
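To see the three predicates side by side, here is a minimal sketch in Python. It assumes the standard-library xml.etree.ElementTree module, which implements only a subset of XPath, so the first two predicates are emulated with ordinary Python; only //div[span] is passed to the library directly. The variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<html><body><div><div>No span</div><span>Target</span></div></body></html>"
)
divs = list(doc.iter("div"))  # every <div>, in document order

# //div[//span]: the predicate is evaluated from the document root,
# so it gives the same yes/no answer for every <div>.
span_anywhere = doc.find(".//span") is not None
absolute = [d for d in divs if span_anywhere]

# //div[.//span]: the predicate is evaluated from each <div> in turn.
relative = [d for d in divs if d.find(".//span") is not None]

# //div[span]: child-only test, which ElementTree supports directly.
child_only = doc.findall(".//div[span]")

print(len(absolute), len(relative), len(child_only))  # 2 1 1
```

The absolute form matches both <div>s because its predicate never looks at the <div> being tested.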

Related

How to get descendants with a specific tag name and text in protractor?

I have the following structure (it's just a sample). In Protractor, I get the top element by id. However, the other elements do not have ids. I need to get the "label" element that contains the text '20'. Is there an easy way in Protractor to select the element with a specific tag that contains a specific text from all the descendants of a parent element?
<pc-selector _... id="Number1">
<div ...></div>
<div ...>
<div ...>
<check-box _...>
<div _ngcontent-c25="" ...>
<label _ngcontent-c25="">
<input _ngcontent-c25="" type="checkbox">
<span _ngcontent-c25="" class="m-checkbox__marker"></span>
20 More text to follow</label>
</div>
</check-box>
</div>
</div>
</pc-selector>
I couldn't find anything, so I tried XPath, but Protractor complains that my XPath is invalid:
parentElement = element(by.id('Number1'));
return parentElement.element(by.xpath(".//label[contains(text(),'20'))]"));
Any ideas?
You have an extra closing parenthesis in [contains(text(),'20'))], which is likely causing the error. Beyond that, there are several other ways to do this, using a single XPath or by chaining locators.
The approach is to find the pc-selector element with the correct id first, and then locate the label that is a descendant of it.
//Xpath
element(by.xpath("//pc-selector[@id='Number1']//label[contains(text(),'20')]"));
//Chained CSS
element(by.id('Number1')).element(by.cssContainingText('label','20'));
You also may be interested to learn about xpath axes which can allow us to do very dynamic selection.
You can also use a direct XPath to reach the label.
element(by.xpath("//*[@id='Number1']//label"));
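One thing to watch with contains(text(),'20'): in XPath 1.0, a node-set passed to a string function is converted using only its first node. In the markup above, the label's first text node is the whitespace before the <input>, and the "20 ..." text comes after the <span>, so contains(., '20') or text()[contains(., '20')] may be the safer test. The sketch below illustrates the difference with Python's standard-library ElementTree on a trimmed-down, hypothetical copy of the markup (attributes removed and void tags closed so that it parses as XML).

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the markup in the question (attributes trimmed).
root = ET.fromstring("""
<pc-selector id="Number1">
  <div><check-box><div>
    <label>
      <input type="checkbox"/>
      <span class="m-checkbox__marker"></span>
      20 More text to follow</label>
  </div></check-box></div>
</pc-selector>
""")

label = root.find(".//label")

# The first text node of <label>: what contains(text(), '20') inspects.
first_text_node = label.text or ""
# The full string value of <label>: what contains(., '20') inspects.
string_value = "".join(label.itertext())

print("20" in first_text_node)  # False
print("20" in string_value)     # True
```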

How can I select nodes that don't contain links but which do contain specific text using xpath

Given the following HTML:
$content =
'<html>
<body>
<div>
<p>During the interim there shall be nourishment supplied</p>
</div>
<div>
<p>During the interim there shall be interim nourishment supplied</p>
</div>
<div>
<ul><li>During the interim there shall be nourishment supplied</li></ul>
</div>
</body>
</html>';
I want all the nodes containing the word "interim" but not if the word "interim" is part of a link element.
The nodes I would expect back are the first P node and the LI node only.
I've tried the following:
'//*/text()[not(a) and contains(.,"interim")]'
... but this still returns the A and also returns part of its parent P node (the part after the A), neither of which are desired. You can see my attempt here: https://glot.io/snippets/ehp7hmmglm
If you use the XPath expression //*[not(self::a) and not(a) and text()[contains(.,"interim")]] then you get all elements that do not contain an a element, are not a elements and contain a text node child containing that word.
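As a rough illustration of what that expression tests, here is a sketch using Python's standard-library ElementTree. The module does not support not(), self:: or contains(), so each condition is emulated in Python. The placement of the <a> in the second paragraph is an assumption, since the link markup is not visible in the question as posted.

```python
import xml.etree.ElementTree as ET

# Assumption: the second <p> wraps one "interim" in an <a>; the link markup
# is not visible in the question as posted, so its placement is made up.
root = ET.fromstring("""<html><body>
<div><p>During the interim there shall be nourishment supplied</p></div>
<div><p>During the <a href="#">interim</a> there shall be interim nourishment supplied</p></div>
<div><ul><li>During the interim there shall be nourishment supplied</li></ul></div>
</body></html>""")

def own_text_nodes(el):
    """Yield the text nodes that are direct children of el (XPath: text())."""
    if el.text:
        yield el.text
    for child in el:
        if child.tail:
            yield child.tail

matches = [
    el for el in root.iter()
    if el.tag != "a"                    # not(self::a)
    and el.find("a") is None            # not(a): no <a> child
    and any("interim" in t for t in own_text_nodes(el))  # text()[contains(., "interim")]
]
print([el.tag for el in matches])  # ['p', 'li']
```

Note that not(a) only rules out <a> children; an element with a link nested deeper would need not(.//a).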

Xpath: select div that contains class AND whose specific child element contains text

With the help of this SO question I have an almost working xpath:
//div[contains(@class, 'measure-tab') and contains(., 'someText')]
However, this matches two divs: in one, the child td contains someText; in the other, a nested span does.
How do I narrow it down to the one with the span?
<div class="measure-tab">
<!-- table html omitted -->
<td> someText</td>
</div>
<div class="measure-tab"> <-- I want to select this div (and use contains @class)
<div>
<span> someText</span> <-- that contains a deeply nested span with this text
</div>
</div>
To find a div of a certain class that contains a span at any depth containing certain text, try:
//div[contains(@class, 'measure-tab') and contains(.//span, 'someText')]
That said, this solution looks extremely fragile. If the table happens to contain a span with the text you're looking for, the div containing the table will be matched, too. I'd suggest finding a more robust way of filtering the elements, for example by using IDs or top-level document structure.
You can use ancestor. I find that this is easier to read because the element you are actually selecting is at the end of the path.
//span[contains(text(),'someText')]/ancestor::div[contains(@class, 'measure-tab')]
You could use this XPath:
//div[@class="measure-tab" and .//span[contains(., "someText")]]
Input :
<root>
<div class="measure-tab">
<td> someText</td>
</div>
<div class="measure-tab">
<div>
<div2>
<span>someText2</span>
</div2>
</div>
</div>
</root>
Output :
Element='<div class="measure-tab">
<div>
<div2>
<span>someText2</span>
</div2>
</div>
</div>'
You can change your second condition to check only the span element:
...and contains(div/span, 'someText')]
If the span isn't always inside another div you can also use
...and contains(.//span, 'someText')]
This searches for the span anywhere inside the div.
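To make the "class on the div, text in a descendant span" condition concrete, here is a small sketch with Python's standard-library ElementTree. contains() is not available there, so both tests are written out in Python; the helper name has_span_with_text is made up for this example, and the table markup in the first div is a guess at what the question omitted. It corresponds to the .//span[contains(., 'someText')] form, which checks every descendant span, whereas contains(.//span, 'someText') in XPath 1.0 looks only at the first span.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""<root>
<div class="measure-tab"><table><tr><td> someText</td></tr></table></div>
<div class="measure-tab"><div><span> someText</span></div></div>
</root>""")

def has_span_with_text(div, needle):
    """True if any descendant <span> of div has needle in its string value."""
    return any(needle in "".join(s.itertext()) for s in div.iter("span"))

matches = [
    d for d in root.iter("div")
    if "measure-tab" in d.get("class", "")      # contains(@class, 'measure-tab')
    and has_span_with_text(d, "someText")       # .//span[contains(., 'someText')]
]
print(len(matches))  # 1
```

Only the second div is kept: the first has the text in a td, and the inner div has no class attribute.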

What is Valid Xpath for link extract by div class name?

Here is html code:
<div class="poster">
<a href="/title/tt2091935/mediaviewer/rm4278707200?ref_=tt_ov_i"> <img alt="Mr. Right Poster" title="Mr. Right Poster" src="http://ia.media-imdb.com/images/M/MV5BOTcxNjUyOTMwOV5BMl5BanBnXkFtZTgwMzUxMDk4NzE@._V1_UX182_CR0,0,182,268_AL_.jpg" itemprop="image">
</a> </div>
I want to know the exact XPath to get the href link.
I tried //a/@href[@class='poster'], but it doesn't work.
The <div> contains the <a> so you can use that to navigate:
//div[@class='poster']/a/@href
Remember that the "poster" class is defined on the <div> not on the <a> so that's where you need to apply the predicate.
//div returns all <div> elements
[@class='poster'] is a predicate that filters by class
/a returns all <a> elements that are children of those <div>s
/@href gives us the attribute we want
Depending on the system you're using, you might need to wrap the whole expression in string() in order to bring back the attribute value rather than the attribute node.
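For example, with Python's standard-library ElementTree (which cannot select attribute nodes, so the final /@href step becomes a .get("href") call), a minimal sketch looks like this. The markup is a shortened copy of the snippet above with the <img> closed so that it parses as XML.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<body><div class="poster">'
    '<a href="/title/tt2091935/mediaviewer/rm4278707200?ref_=tt_ov_i">'
    '<img alt="Mr. Right Poster"/></a>'
    "</div></body>"
)

# The class predicate is applied to the <div>, then we step to its <a> children.
links = root.findall(".//div[@class='poster']/a")
hrefs = [a.get("href") for a in links]
print(hrefs)  # ['/title/tt2091935/mediaviewer/rm4278707200?ref_=tt_ov_i']
```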

Using XPath expression how can i get the first text node immediately following a node?

I want to get to the exact node having this text: 'Company'. Once I get to this node I want to get to the next text node immediately following this node because this contains the company name. How can I do this with Xpath?
Fragment of XML is:
<div id="jobsummary">
<div id="jobsummary_content">
<h2>Job Summary</h2>
<dl>
<dt>Company</dt>
<!-- the following element is the one I'm looking for -->
<dd><span class="wrappable">Pinpoint IT Services, LLC</span></dd>
<dt>Location</dt>
<dd><span class="wrappable">Newport News, VA</span></dd>
<dt>Industries</dt>
<dd><span class="wrappable">All</span></dd>
<dt>Job Type</dt>
<dd class="multipledd"><span class="wrappable">Full Time</span></dd><dd class="multipleddlast"><span class="wrappable"> Employee</span></dd>
</dl>
</div>
</div>
I got to the Company tag with following xpath: //*[text()= 'Company']
Now I want to get to the next text node. My XML is dynamic, so I can't hardcode the node type (such as <dd>) to get the company value. What is certain is that the value will be in the immediately following text node.
So how can I get to the text node immediately after the node with text as Company?
If you cannot hardcode any part of the following-sibling node, your XPath should look like this:
//*[text()='Company']/following::*[1]/*/text()
assuming that the desired text is always enclosed in another element like span. The [1] restricts the match to the first element after the Company node; without it, the text of every later entry would be selected as well.
To allow for variations in the dt text, modify your XPath to
//*[text()='Company' or text()='Company:' or text()='Company Name']/following::*[1]/*/text()
Use //*[text()='Company']/following-sibling::dd to get the dd elements that follow.
You can even add conditions for that dd and also go further into it.
following-sibling::elementName selects the later siblings under the same parent that meet your requirements.
With no conditions, like above, it selects every dd after 'Company'; append [1] to keep only the nearest one.
The text is in the span so you might try
//*[text()='Company']/following-sibling::dd/span
As another example, let's say that you also want to get the Industries text that goes with the currently selected 'Company'.
Starting from //*[text()='Company'],
you can modify it like this: //*[text()='Company']/following-sibling::dt[text()='Industries']/following-sibling::dd[1]/span
Of course, instead of hardcoding the values for text(), you can use variables.
You can use XPathNavigator and step through the nodes one by one.
I think XPathNavigator::MoveToNext is the method you are looking for.
There is sample code as well at:
http://msdn.microsoft.com/en-us/library/9yxc3x24.aspx
Use this general XPath expression that selects the wanted text node even when it is wrapped in statically unknown markup elements:
(//*[text()='Company']/following-sibling::*[1]//text())[1]
When this XPath expression is evaluated against the provided XML document:
<div id="jobsummary">
<div id="jobsummary_content">
<h2>Job Summary</h2>
<dl>
<dt>Company</dt>
<!-- the following element is the one I'm looking for -->
<dd><span class="wrappable">Pinpoint IT Services, LLC</span></dd>
<dt>Location</dt>
<dd><span class="wrappable">Newport News, VA</span></dd>
<dt>Industries</dt>
<dd><span class="wrappable">All</span></dd>
<dt>Job Type</dt>
<dd class="multipledd"><span class="wrappable">Full Time</span></dd><dd class="multipleddlast"><span class="wrappable"> Employee</span></dd>
</dl>
</div>
</div>
exactly the wanted text node is selected:
Pinpoint IT Services, LLC
Even if we change the XML to this one:
<div id="jobsummary">
<div id="jobsummary_content">
<h2>Job Summary</h2>
<div>
<p>Company</p>
<!-- the following element is the one I'm looking for -->
<dd><span class="wrappable"><b><i><u>Pinpoint IT Services, LLC</u></i></b></span></dd>
<dt>Location</dt>
<dd><span class="wrappable">Newport News, VA</span></dd>
<dt>Industries</dt>
<dd><span class="wrappable">All</span></dd>
<dt>Job Type</dt>
<dd class="multipledd"><span class="wrappable">Full Time</span></dd><dd class="multipleddlast"><span class="wrappable"> Employee</span></dd>
</div>
</div>
</div>
the XPath expression above still selects the wanted text node:
Pinpoint IT Services, LLC
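The same idea (take the element right after the 'Company' node, then the first text node inside it however deeply it is wrapped) can be sketched in Python with the standard-library ElementTree. That module has no following-sibling axis and no parent pointers, so this sketch builds a child-to-parent map by hand; the function name first_text_after is made up, the markup is a shortened variant of the sample above, and comparing el.text is a simplification of text()='Company' that only looks at an element's first text node.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""<dl>
<dt>Company</dt>
<dd><span class="wrappable"><b><i>Pinpoint IT Services, LLC</i></b></span></dd>
<dt>Location</dt>
<dd><span class="wrappable">Newport News, VA</span></dd>
</dl>""")

# ElementTree elements do not know their parent, so record it up front.
parent_of = {child: parent for parent in root.iter() for child in parent}

def first_text_after(label):
    """Return the first text node inside the sibling right after the label node."""
    for el in root.iter():
        if el.text == label and el in parent_of:
            siblings = list(parent_of[el])
            idx = siblings.index(el)
            if idx + 1 < len(siblings):
                # itertext() walks text nodes in document order, skipping empty ones.
                return next(siblings[idx + 1].itertext(), None)
    return None

print(first_text_after("Company"))   # Pinpoint IT Services, LLC
print(first_text_after("Location"))  # Newport News, VA
```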
