As per related question answered I tried,
.//*[contains(#class, 'info') and contains(., 'someText')]
but I get the error invalid xpath.
How can I get this class or the span element from this ?
<p class="info"> <-- I want to select this class
<span> someText</span> <-- that contains span with this text
</p>
xpath expression:
//*[contains(#class, 'info') and ./span[contains(., 'someText')]]
Related
I have a block of code like so:
<ul class="open-menu">
<span>
<li data-testid="menu-item" class="menu-item option">
<svg>...</svg>
<div>
<strong>Text Here</strong>
<small>...</small>
</div>
</li>
<li data-testid="menu-item" class="menu-item option">
<svg>...</svg>
<div>
<strong>Text</strong>
<small>...</small>
</div>
</li>
</span>
</ul>
I'm trying to select a menu item based on exact text like so in the dev tools:
$x('.//*[contains(#data-testid, "menu-item") and normalize-space() = "Text"]');
But this doesn't seem to be selecting the element. However, when I do:
$x('.//*[contains(#data-testid, "menu-item")]');
I can see both of the menu items.
UPDATE:
It seems that this works:
$x('.//*[contains(#class, "menu-item") and normalize-space() = "Text"]');
Not sure why using a class in this context works and not a data-testid. How can I get my xpath selector to work with my data-testid?
Why is this exact text selector not working
The fact that both li elements are matched by the XPath expression
if omitting the condition normalize-space() = "Text" is a clue.
normalize-space() returns ... Text Here ... for the first li
in the posted XML and ... Text ... for the second (or some other
content in place of ... from div/svg or div/small) causing
normalize-space() = "Text" to fail.
In an update you say the same condition succeeds. This has nothing to
do with using #class instead of #data-testid; it must be triggered
by some content change.
How can I get my xpath selector to work with my data-testid?
By testing for an exact text match in the li's descendant strong
element,
.//*[#data-testid = "menu-item" and div/strong = "Text"]
which matches the second li. Making the test more robust is usually
in order, e.g.
.//*[contains(#data-testid,"menu-item") and normalize-space(div/strong) = "Text"]
Append /div/small or /descendant::small, for example, to the XPath
expression to extract just the small text.
data-testid="menu-item" is matching both the outer li elements while text content you are looking for is inside the inner strong element.
So, to locate the outer li element based on it's data-testid attribute value and it's inner strong element text value you can use XPath expression like this:
//*[contains(#data-testid, "menu-item") and .//normalize-space() = "Text"]
Or
.//*[contains(#data-testid, "menu-item") and .//*[normalize-space() = "Text"]]
I have tested, both expressions are working correctly
Hello I have this HTML:
<div class="_3Vhpd"><span>Your commerce Data</span>
<a class="n3G0C" href='http://www.webadress.......'><span>Some Text</span</a>
</div>
I tried to obtain the tag as follow:
parser.xpath('//div[contains(#class,"_3Vhpd")]//following-sibling::*[a[#class="n3G0C"]]/#href ')
but I received none '[]'. Maybe because is not just after div but after a span...
First, you sample html doesn't have a class="n3G0C", but assuming you fix it, this xpath expression should work:
//div[contains(#class,"_3Vhpd")]//following-sibling::a/#href
Output:
http://www.webadress.......
How do i get the text of a child element, if the parent element contains text with a specific string?
For example:
<li>
"string1"
<span>
"Hello"
</span>
</li>
<li>
"string2"
<span>
"Ola"
</span>
</li>
From the above html code, how to get only string "Ola" using xpath?
Without knowing scrapy, I would try
//li[text()[contains(.,"string2")]]/span/text()
//li[text()[contains(.,"string2")]] select a li element that text contains string2
/span select a element span below the selected li
/text(): return the text of the selected span element
Update: This is simpler and should also work:
//li[contains(text(),"string2")]/span/text()
I have a html like:
...
<div class="grid">
"abc"
<span class="searchMatch">def</span>
</div>
<div class="grid">
<span class="searchMatch">def</span>
</div>
...
I want to get the div which not contains text,but xpath
//div[#class='grid' and text()='']
seems doesn't work,and if I don't know the text that other divs have,how can I find the node?
Let's suppose I have inferred the requirement correctly as:
Find all <div> elements with #class='grid' that have no directly-contained non-whitespace text content, i.e. no non-whitespace text content unless it's within a child element like a <span>.
Then the answer to this is
//div[#class='grid' and not(text()[normalize-space(.)])]
You need a not() statement + normalize-space() :
//div[#class='grid' and not(normalize-space(text()))]
or
//div[#class='grid' and normalize-space(text())='']
With the help of this SO question I have an almost working xpath:
//div[contains(#class, 'measure-tab') and contains(., 'someText')]
However this gets two divs: in one it's the child td that has someText, the other it's child span.
How do I narrow it down to the one with the span?
<div class="measure-tab">
<!-- table html omitted -->
<td> someText</td>
</div>
<div class="measure-tab"> <-- I want to select this div (and use contains #class)
<div>
<span> someText</span> <-- that contains a deeply nested span with this text
</div>
</div>
To find a div of a certain class that contains a span at any depth containing certain text, try:
//div[contains(#class, 'measure-tab') and contains(.//span, 'someText')]
That said, this solution looks extremely fragile. If the table happens to contain a span with the text you're looking for, the div containing the table will be matched, too. I'd suggest to find a more robust way of filtering the elements. For example by using IDs or top-level document structure.
You can use ancestor. I find that this is easier to read because the element you are actually selecting is at the end of the path.
//span[contains(text(),'someText')]/ancestor::div[contains(#class, 'measure-tab')]
You could use the xpath :
//div[#class="measure-tab" and .//span[contains(., "someText")]]
Input :
<root>
<div class="measure-tab">
<td> someText</td>
</div>
<div class="measure-tab">
<div>
<div2>
<span>someText2</span>
</div2>
</div>
</div>
</root>
Output :
Element='<div class="measure-tab">
<div>
<div2>
<span>someText2</span>
</div2>
</div>
</div>'
You can change your second condition to check only the span element:
...and contains(div/span, 'someText')]
If the span isn't always inside another div you can also use
...and contains(.//span, 'someText')]
This searches for the span anywhere inside the div.