XPath exclude given class - xpath

I'm trying to extract text from a div while excluding a given class.
This is what I'm trying:
$pattern = "//div/@title[not(contains(@class, 'second_card local_impact_icon impact-2'))]";
But it's not excluding the given class. I need to extract just the text of the title attribute, and only from the first div.
This is the html:
<div class="match_info"><div title='Yellow Card' class='local_impact_icon impact-1'></div><div title='Red Card' class='second_card local_impact_icon impact-2'></div></div>

The following XPath
//div/div[not(contains(@class, 'second_card local_impact_icon impact-2'))]/@title
returns
title="Yellow Card"
Simplified explanation: select only the div whose class does not contain the value you want to exclude, then retrieve the title attribute of that div. If you put the exclusion at the position ../@title, you are already at the title attributes of both divs, so the predicate is tested against the attribute nodes rather than against the divs.
And since the question is how to retrieve the text, in the given example
string(//div/div[not(contains(@class, 'second_card local_impact_icon impact-2'))]/@title)
returns Yellow Card
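To see the difference between the two predicate positions, here is a minimal sketch in Python with lxml (my choice of tool, the question does not say which XPath engine is used). The markup is copied from the question; the shortened 'second_card' substring is an assumption made for brevity.

```python
from lxml import html

# Markup copied from the question.
snippet = (
    "<div class=\"match_info\">"
    "<div title='Yellow Card' class='local_impact_icon impact-1'></div>"
    "<div title='Red Card' class='second_card local_impact_icon impact-2'></div>"
    "</div>"
)
tree = html.fromstring(snippet)

# Predicate on the attribute node: an attribute has no @class of its own,
# so contains() is always false, not() is always true, nothing is excluded.
unfiltered = tree.xpath("//div/@title[not(contains(@class, 'second_card'))]")

# Predicate on the div element: filter first, then step to @title.
filtered = tree.xpath("//div/div[not(contains(@class, 'second_card'))]/@title")

# string() returns the value of the first matching node as plain text.
text = tree.xpath("string(//div/div[not(contains(@class, 'second_card'))]/@title)")

print(unfiltered, filtered, text)
```

Keep in mind that contains() does a plain substring test on the whole class string, so 'second_card' would also match a class such as 'second_card_extra'.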

Related

xPath: fetch element with an attribute containing the text of another element

Given I have the following HTML structure:
<button aria-labelledby="ref-1" id="foo" onclick="convey(event)">action 2</button>
<div class="anotherElement">foobar</div>
<div id="ref-1" hidden>target 2</div>
I would like to fetch the button by its aria-labelledby attribute. I tried the following options:
//*[@aria-labelledby=string(/div[@id="ref-1"]/@id)]
//*[@aria-labelledby = string(.//*[normalize-space() = "target 2"]/@id)]
//*[@aria-labelledby = .//*[normalize-space() = "target 2"]/@id]
But I wasn't able to fetch the element. Does anyone have an idea what the right XPath could be?
Edit: simply put, how do I fetch the button element if my only information is "target 2", and if both elements can be located anywhere in the document?
//button[@aria-labelledby='ref-1']
or
//button[@aria-labelledby=(//*/@id)]
or
//button[@aria-labelledby=(//*[contains(.,'target 2')]/@id)]
or
//button[@aria-labelledby=(//*[contains(text(),'target 2')]/@id)]
?
Since the button and the div are siblings at the same level here, you can use the preceding-sibling axis like this:
//div[text()='target 2']//preceding-sibling::button
Note that with your actual XML this will match 2 button elements.
To make a more precise match, I think we would need more details to go on, not only the target 2 text.
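For what it's worth, the attempts in the question fail because of where the inner path starts: /div[...] looks for a div as the root element, and .//* inside the predicate only searches below the candidate element itself (the button). Starting the inner path with // searches the whole document. A minimal sketch in Python with lxml; the wrapping div is an assumption added so the fragment has a single root, and the onclick attribute is left out.

```python
from lxml import html

# Markup from the question, wrapped in a div (assumption) to get a single root.
page = html.fromstring(
    "<div>"
    '<button aria-labelledby="ref-1" id="foo">action 2</button>'
    '<div class="anotherElement">foobar</div>'
    '<div id="ref-1" hidden>target 2</div>'
    "</div>"
)

# The inner path is absolute (//*), so the labelling element can sit anywhere.
# "=" against a node-set is true if any node in the set has an equal value.
buttons = page.xpath(
    "//button[@aria-labelledby = //*[normalize-space() = 'target 2']/@id]"
)
button_ids = [b.get("id") for b in buttons]
print(button_ids)
```

Unlike the preceding-sibling version, this does not depend on the two elements being siblings.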

Xpath get element above

suppose I have this structure:
<div class="a" attribute="foo">
<div class="b">
<span>Text Example</span>
</div>
</div>
In XPath, I would like to retrieve the value of the attribute "attribute", given that I have the text inside: Text Example
If I use this XPath:
.//*[@class='a']//*[text()='Text Example']
it returns the span element, but I need the div.a, because I need to get the value of the attribute through Selenium WebDriver.
There are a lot of ways to do this.
Let's say Text Example is given; you can identify the div using this text:
//span[text()='Text Example']/../.. --> If you know it's 2 levels up
OR
//span[text()='Text Example']/ancestor::div[@class='a'] --> If you don't know how many levels up this `div` is
The 2 XPaths above can be used if you only want to identify the element using Text Example. If you don't want to go through this text, there are simpler ways to identify it directly:
//div[@class='a']
You have already mentioned the answer in your question itself:
but I need the div.a,
try this
driver.findElement(By.cssSelector("div.a")).getAttribute("attribute");
Use cssSelector for best results.
Or else try the following XPath:
//div[contains(@class, 'a')]
If you want the attribute of the div.a whose descendant span contains some text, try the following:
driver.findElement(By.xpath("//div[@class = 'a' and descendant::span[text() = 'Text Example']]")).getAttribute("attribute");
Hope it helps..:)
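The two text-based approaches above (walking up with ancestor::, or filtering the div with a descendant:: predicate) can be checked outside of Selenium. This is only a sketch in Python with lxml, not the WebDriver API used in the answers; the markup is the one from the question.

```python
from lxml import html

# Markup from the question.
doc = html.fromstring(
    '<div class="a" attribute="foo">'
    '<div class="b"><span>Text Example</span></div>'
    "</div>"
)

# Start at the span and walk up to the enclosing div.a.
by_ancestor = doc.xpath(
    "//span[text()='Text Example']/ancestor::div[@class='a']/@attribute"
)

# Start at div.a and keep it only if it contains the span.
by_predicate = doc.xpath(
    "//div[@class='a' and descendant::span[text()='Text Example']]/@attribute"
)

print(by_ancestor, by_predicate)
```

Both expressions land on the same div, so in Selenium either one can be passed to By.xpath without the trailing /@attribute step and followed by getAttribute("attribute").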

XPath - Nested path scraping

I'm trying to scrape a webpage. I'd like to fetch the three alternate texts (alt, highlighted) from the three "img" elements.
I'm using the following code to extract the whole "img" element of slide-1.
from lxml import html
import requests
page = requests.get('sample.html')
tree = html.fromstring(page.content)
text_val = tree.xpath('//a[class="cover-wrapper"][id = "slide-1"]/text()')
print text_val
I'm not getting the alternate text values; the result is an empty list.
HTML Script used:
This is one possible XPath:
//div[@id='slide-1']/a[@class='cover-wrapper']/img/@alt
Explanation:
//div[@id='slide-1'] : This part finds the target <div> element by comparing the id attribute value. Notice the use of the @attribute_name syntax to reference an attribute in XPath. Omitting the @ symbol changes the meaning of the selector so that it references a child element with the same name instead of an attribute.
/a[@class='cover-wrapper'] : from each <div> element found by the previous part of the XPath, find the child element <a> whose class attribute value equals 'cover-wrapper'
/img/@alt : then from each such <a> element, find the child element <img> and return its alt attribute
You might want to change the id filter to starts-with(@id,'slide-') if you meant to return all 3 alt attributes in the screenshot.
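Here is how that could look with lxml, as a rough sketch. The HTML in the question was only posted as a screenshot, so the markup below is hypothetical: the showcase container, the alt texts and the file names are made up, and only the slide-N ids and the cover-wrapper class come from the thread.

```python
from lxml import html

# Hypothetical markup standing in for the screenshot in the question.
sample = """
<div class="showcase">
  <div class="showcase-wrapper" id="slide-1">
    <a class="cover-wrapper" href="#"><img alt="First cover" src="1.jpg"></a>
  </div>
  <div class="showcase-wrapper" id="slide-2">
    <a class="cover-wrapper" href="#"><img alt="Second cover" src="2.jpg"></a>
  </div>
  <div class="showcase-wrapper" id="slide-3">
    <a class="cover-wrapper" href="#"><img alt="Third cover" src="3.jpg"></a>
  </div>
</div>
"""
tree = html.fromstring(sample)

# One slide: exact match on the id attribute.
first_alt = tree.xpath("//div[@id='slide-1']/a[@class='cover-wrapper']/img/@alt")

# All slides: match every id that starts with 'slide-'.
all_alts = tree.xpath(
    "//div[starts-with(@id, 'slide-')]/a[@class='cover-wrapper']/img/@alt"
)

print(first_alt, all_alts)
```

Note that /text() in the original attempt would not have returned the alt values even with the @ symbols in place, because alt is an attribute and not a text node.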
Try this:
//a[@class="cover-wrapper"]/img/@alt
Here I first select the a node with class cover-wrapper, then the img node, and then the alt attribute of img.
To find the whole image element:
//a[@class="cover-wrapper"]/img
I think you want:
//div[@class="showcase-wrapper"][@id="slide-1"]/a/img/@alt

What XPath do I need to extract the text inside a SPAN that is preceded by a specific label inside a STRONG, both inside a P?

What XPath do I need to extract the text inside a SPAN that is preceded by a specific label inside a STRONG, both inside a P?
For example, to extract the website and email addresses from a page that looks like this:
<p>
<strong>Website:</strong>
<span>www.example.com</span>
</p>
<p>
<strong>Contact email:</strong>
<span>email@example.com</span>
</p>
This should do:
//p/span[preceding::*[1][self::strong and . = 'Contact email:']]
Here, you are selecting all p/span elements whose first preceding element is a strong with the label Contact email:
Website:
//p/span[preceding::strong[1]/text()='Website:']
Email:
//p/span[preceding::strong[1]/text()='Contact email:']
It is also important to note that, by using the preceding axis as shown in the other two answers, the XPath will mistakenly return a span element in markup like the following:
<strong>Website:</strong>
<p>
<span>www.example.com</span>
</p>
You can use the preceding-sibling axis instead to avoid the mistake mentioned above:
//p/span[preceding-sibling::*[1][self::strong and . = 'Website:']]
The preceding-sibling axis only considers elements that are located before the context element (the span in this case) and are siblings of it (share the same parent).
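To make the difference concrete, here is a small sketch in Python with lxml that runs both axes against one document containing a well-formed pair and a stray label. The wrapping div and the stray.example.org value are assumptions added for the demonstration.

```python
from lxml import html

# First <p> is the intended shape; the second <strong> sits outside its <p>.
doc = html.fromstring(
    "<div>"
    "<p><strong>Website:</strong> <span>www.example.com</span></p>"
    "<strong>Website:</strong>"
    "<p><span>stray.example.org</span></p>"
    "</div>"
)

# preceding:: looks at everything earlier in the document (except ancestors),
# so the stray label outside the second <p> still counts.
loose = doc.xpath(
    "//p/span[preceding::*[1][self::strong and . = 'Website:']]/text()"
)

# preceding-sibling:: only looks at earlier children of the same parent.
strict = doc.xpath(
    "//p/span[preceding-sibling::*[1][self::strong and . = 'Website:']]/text()"
)

print(loose, strict)
```

The preceding axis picks up both spans, while preceding-sibling keeps only the one whose label is inside the same p.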

Retrieving a parent tag with a given attribute that contains a subelement by using XPath

How can I retrieve multiple DIVs (with a given class attribute "a") that contain a span tag with a class attribute "b" by using XPath?
<div class='a'>
<span class='b'/>
</div>
The structure of my XML is not defined, so basically the span could be at any level inside the div, and the div itself could be at any level of the XML tree.
This should work:
//div[@class='a'][span/@class='b']
// at the start of an expression means search anywhere in the document.
If the span is deeper in the div, use the descendant:: axis, which can be shortened to .// inside the predicate:
//div[@class='a'][.//span/@class='b']
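A quick way to compare the two predicates is to run them on a document that has the span at different depths. This is a minimal sketch in Python with lxml; the root element and the id values are hypothetical and only there to tell the divs apart.

```python
from lxml import etree

# Hypothetical document: span as a direct child, nested deeper, and absent.
xml = """
<root>
  <div class="a" id="direct"><span class="b"/></div>
  <div class="a" id="nested"><p><span class="b"/></p></div>
  <div class="a" id="none"><span class="c"/></div>
</root>
"""
root = etree.fromstring(xml)

# span/@class only looks at direct children of the div.
direct_only = root.xpath("//div[@class='a'][span/@class='b']/@id")

# .//span/@class looks at any depth below the div.
any_depth = root.xpath("//div[@class='a'][.//span/@class='b']/@id")

print(direct_only, any_depth)
```

Since the question says the span can be at any level of the div, the .// form is the one that covers every case.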

Resources