Can't get contents from XPath in Codeception - xpath

I have an element:
<a href="/s-xQ6qeR/documents/download?revid=28">
<span class="icon icon-file-pdf-o" style="vertical-align: middle"></span> test_upload_uwfacjtn.pdf
</a>
I need to check that this element is on the page.
I tried this:
$fileHref = $this->I->grabAttributeFrom("//a[contains(., 'test_upload_uwfacjtn.pdf')]", 'href');
But I got an error:
Step Grab attribute from "//a[contains(.,
'test_upload_uwfacjtn.pdf')]","href" Fail Element that matches CSS
or XPath element with '//a[contains(., 'test_upload_uwfacjtn.pdf')]'
was not found.

I found two ways to check the text inside an HTML tag:
1. Using the method grabTextFrom and then comparing the result
$fileName = $I->grabTextFrom('//a[@href="/s-xQ6qeR/documents/download?revid=28"]');
$I->assertEquals('test_upload_uwfacjtn.pdf', $fileName);
This can be useful if you want to put the result in a variable and use it in other tests later.
2. Using the method seeElement with the text to compare inside your XPath
$I->seeElement('//a[normalize-space()="test_upload_uwfacjtn.pdf"]');
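To see where the file name actually sits in the markup from the question, here is a minimal Python sketch. It uses only the standard library's xml.etree.ElementTree, which is an assumption for illustration and not part of Codeception; the variable names are hypothetical. It suggests that the name is a text node of the a element that follows the empty span, so text checks are better aimed at the a element.

```python
import xml.etree.ElementTree as ET

# Markup copied from the question.
html = '''<a href="/s-xQ6qeR/documents/download?revid=28">
<span class="icon icon-file-pdf-o" style="vertical-align: middle"></span> test_upload_uwfacjtn.pdf
</a>'''

a = ET.fromstring(html)
span = a.find("span")

# The span itself has no text; the file name is stored as the text
# that follows the span inside <a> (ElementTree calls this the "tail").
span_text = span.text or ""

# Collect all text under <a> and collapse whitespace, roughly what
# XPath's normalize-space() does on the element's string value.
link_text = " ".join("".join(a.itertext()).split())

href = a.get("href")

print(repr(span_text), link_text, href)
```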

Related

xPath: fetch element with an attribute containing the text of another element

Given I have the following HTML structure:
<button aria-labelledby="ref-1" id="foo" onclick="convey(event)">action 2</button>
<div class="anotherElement">foobar</div>
<div id="ref-1" hidden>target 2</div>
I would like to fetch button by its aria-labelledby attribute. I tried the following options:
//*[@aria-labelledby=string(/div[@id="ref-1"]/@id)]
//*[@aria-labelledby = string(.//*[normalize-space() = "target 2"]/@id)]
//*[@aria-labelledby = .//*[normalize-space() = "target 2"]/@id]
But I wasn't able to fetch the element. Does anyone have an idea what the right XPath could be?
Edit: simply put: how do I fetch the button element if my only information is "target 2", and if both elements can be randomly located?
//button[@aria-labelledby='ref-1']
or
//button[@aria-labelledby=(//*/@id)]
or
//button[@aria-labelledby=(//*[contains(.,'target 2')]/@id)]
or
//button[@aria-labelledby=(//*[contains(text(),'target 2')]/@id)]
?
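The nested lookup in the last two expressions (first find the id of the element whose text is "target 2", then find the element whose aria-labelledby equals that id) can be sketched in Python. This is only an illustration under some assumptions: it uses the standard library's xml.etree.ElementTree, the markup is wrapped in a body root, hidden is written as hidden="hidden" so that it parses as XML, and find_labelled_by is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Markup from the question, adapted to be well-formed XML (see assumptions above).
markup = '''<body>
<button aria-labelledby="ref-1" id="foo" onclick="convey(event)">action 2</button>
<div class="anotherElement">foobar</div>
<div id="ref-1" hidden="hidden">target 2</div>
</body>'''

root = ET.fromstring(markup)


def find_labelled_by(root, label_text):
    """Return elements whose aria-labelledby equals the id of an
    element whose own text is label_text (hypothetical helper)."""
    # Step 1: ids of all elements carrying the label text.
    ids = [el.get("id") for el in root.iter()
           if el.get("id") and (el.text or "").strip() == label_text]
    # Step 2: elements that reference one of those ids, anywhere in the tree.
    found = []
    for ref_id in ids:
        found.extend(root.findall(f".//*[@aria-labelledby='{ref_id}']"))
    return found


buttons = find_labelled_by(root, "target 2")
print([(b.tag, b.get("id")) for b in buttons])
```

Because both steps search the whole tree, the relative position of the two elements does not matter.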
Since the button and the div are siblings at the same level here, you can use a preceding-sibling XPath expression like this:
//div[text()='target 2']//preceding-sibling::button
Note that with your actual XML this will match 2 button elements.
To make the match more precise, I think we would need more details to go on, not only the target 2 text.

xPath - Why is this exact text selector not working with the data test id?

I have a block of code like so:
<ul class="open-menu">
<span>
<li data-testid="menu-item" class="menu-item option">
<svg>...</svg>
<div>
<strong>Text Here</strong>
<small>...</small>
</div>
</li>
<li data-testid="menu-item" class="menu-item option">
<svg>...</svg>
<div>
<strong>Text</strong>
<small>...</small>
</div>
</li>
</span>
</ul>
I'm trying to select a menu item based on exact text like so in the dev tools:
$x('.//*[contains(@data-testid, "menu-item") and normalize-space() = "Text"]');
But this doesn't seem to be selecting the element. However, when I do:
$x('.//*[contains(@data-testid, "menu-item")]');
I can see both of the menu items.
UPDATE:
It seems that this works:
$x('.//*[contains(@class, "menu-item") and normalize-space() = "Text"]');
Not sure why using a class in this context works and not a data-testid. How can I get my xpath selector to work with my data-testid?
Why is this exact text selector not working
The fact that both li elements are matched by the XPath expression
if omitting the condition normalize-space() = "Text" is a clue.
normalize-space() returns ... Text Here ... for the first li
in the posted XML and ... Text ... for the second (or some other
content in place of ... from div/svg or div/small) causing
normalize-space() = "Text" to fail.
In an update you say the same condition succeeds. This has nothing to
do with using @class instead of @data-testid; it must be triggered
by some content change.
How can I get my xpath selector to work with my data-testid?
By testing for an exact text match in the li's descendant strong
element,
.//*[@data-testid = "menu-item" and div/strong = "Text"]
which matches the second li. Making the test more robust is usually
in order, e.g.
.//*[contains(@data-testid,"menu-item") and normalize-space(div/strong) = "Text"]
Append /div/small or /descendant::small, for example, to the XPath
expression to extract just the small text.
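To make the explanation above concrete, here is a small Python sketch that prints what the whole li text normalizes to and then filters on the strong element only. It is an illustration under assumptions: it uses the standard library's xml.etree.ElementTree rather than the browser's $x, and normalize_space is a hypothetical helper that only approximates XPath's normalize-space().

```python
import xml.etree.ElementTree as ET

# Markup copied from the question.
markup = '''<ul class="open-menu">
<span>
<li data-testid="menu-item" class="menu-item option">
<svg>...</svg>
<div>
<strong>Text Here</strong>
<small>...</small>
</div>
</li>
<li data-testid="menu-item" class="menu-item option">
<svg>...</svg>
<div>
<strong>Text</strong>
<small>...</small>
</div>
</li>
</span>
</ul>'''

root = ET.fromstring(markup)


def normalize_space(text):
    """Trim and collapse whitespace runs (approximation of normalize-space())."""
    return " ".join(text.split())


items = root.findall(".//li[@data-testid='menu-item']")

# String value of each whole <li>: it includes the svg and small content,
# so comparing it with "Text" fails for both items.
full_text = [normalize_space("".join(li.itertext())) for li in items]

# Comparing only the div/strong text selects the second item.
matches = [li for li in items
           if normalize_space(li.findtext("div/strong", default="")) == "Text"]

print(full_text, len(matches))
```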
data-testid="menu-item" matches both outer li elements, while the text content you are looking for is inside the inner strong element.
So, to locate the outer li element based on its data-testid attribute value and its inner strong element's text value, you can use an XPath expression like this:
//*[contains(@data-testid, "menu-item") and .//normalize-space() = "Text"]
Or
.//*[contains(@data-testid, "menu-item") and .//*[normalize-space() = "Text"]]
I have tested both expressions, and they work correctly.

How to scrape the href using Nokogiri

I have a variable e which stores a Nokogiri::XML::Element object.
When I execute puts e, I get the following on the screen:
<h3 class="fixed-recipe-card__h3">
<a href="https://www.allrecipes.com/recipe/21712/chocolate-covered-strawberries/" data-content-provider-id="" data-internal-referrer-link="hub recipe" class="fixed-recipe-card__title-link">
<span class="fixed-recipe-card__title-link">Chocolate Covered Strawberries</span>
</a>
</h3>
I would like to scrape this part https://www.allrecipes.com/recipe/21712/chocolate-covered-strawberries/
How can I do this using Nokogiri?
If you want to extract the link, you can use:
e.at_css("a").attributes["href"].value
.at_css returns the first element matching the CSS selector (another Nokogiri::XML::Element). To get a list of all matching elements, use .css instead.
.attributes gives you a hash mapping attribute name to Nokogiri::XML::Attr. Once you look up the desired attribute in this hash (href), you can call .value to get the actual text value.
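For comparison, the same three steps (first matching element, attribute mapping, value lookup) can be sketched in Python. This is not Nokogiri; it assumes the standard library's xml.etree.ElementTree as a stand-in, and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Markup copied from the question.
markup = '''<h3 class="fixed-recipe-card__h3">
<a href="https://www.allrecipes.com/recipe/21712/chocolate-covered-strawberries/" data-content-provider-id="" data-internal-referrer-link="hub recipe" class="fixed-recipe-card__title-link">
<span class="fixed-recipe-card__title-link">Chocolate Covered Strawberries</span>
</a>
</h3>'''

e = ET.fromstring(markup)

# First matching <a> child, comparable to at_css("a").
link = e.find("a")

# Mapping of attribute name to value, comparable to .attributes
# (here the values are already plain strings).
attrs = dict(link.attrib)

href = attrs["href"]
print(href)
```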

What is a valid XPath to extract a link by div class name?

Here is the HTML code:
<div class="poster">
<a href="/title/tt2091935/mediaviewer/rm4278707200?ref_=tt_ov_i"> <img alt="Mr. Right Poster" title="Mr. Right Poster" src="http://ia.media-imdb.com/images/M/MV5BOTcxNjUyOTMwOV5BMl5BanBnXkFtZTgwMzUxMDk4NzE@._V1_UX182_CR0,0,182,268_AL_.jpg" itemprop="image">
</a> </div>
I want to know the exact XPath to get the href link.
I tried //a/@href[@class='poster'] but it doesn't work.
The <div> contains the <a> so you can use that to navigate:
//div[@class='poster']/a/@href
Remember that the "poster" class is defined on the <div> not on the <a> so that's where you need to apply the predicate.
//div returns all <div> elements
[@class='poster'] is a predicate that filters by class
/a returns all <a> elements that are children of those <div>s
/@href gives us the attribute we want
Depending on the system you're using, you might need to wrap the whole expression in string() in order to bring back the attribute data rather than the DOM node.
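The step-by-step breakdown above can be tried out with a short Python sketch. Some assumptions apply: it uses the standard library's xml.etree.ElementTree, which cannot select attributes with a trailing /@href step, so the attribute is read with get() instead; the img tag is simplified and self-closed so that the markup parses as XML; and the second link is a hypothetical addition to show that the predicate filters it out.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed version of the markup from the question.
# The "/elsewhere" link is hypothetical and only there to be excluded.
markup = '''<body>
<div class="poster">
<a href="/title/tt2091935/mediaviewer/rm4278707200?ref_=tt_ov_i"><img alt="Mr. Right Poster" title="Mr. Right Poster"/></a>
</div>
<a href="/elsewhere">unrelated link</a>
</body>'''

root = ET.fromstring(markup)

# .//div              -> all <div> elements
# [@class='poster']   -> keep only those whose class is exactly "poster"
# /a                  -> their <a> children
links = root.findall(".//div[@class='poster']/a")

# ElementTree has no attribute step, so read href from each element.
hrefs = [a.get("href") for a in links]
print(hrefs)
```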

Xpath get text of nested item not working but css does

I'm making a crawler with Scrapy and wondering why my xpath doesn't work when my CSS selector does? I want to get the number of commits from this html:
<li class="commits">
<a data-pjax="" href="/samthomson/flot/commits/master">
<span class="octicon octicon-history"></span>
<span class="num text-emphasized">
521
</span>
commits
</a>
</li>
Xpath:
response.xpath('//li[@class="commits"]//a//span[@class="text-emphasized"]//text()').extract()
CSS:
response.css('li.commits a span.text-emphasized').css('::text').extract()
CSS returns the number (unescaped), but XPath returns nothing. Am I using the // for nested elements correctly?
Your predicate compares the whole class attribute, which is "num text-emphasized", against "text-emphasized", so nothing matches. Use the contains function to check only that text-emphasized is present:
response.xpath('//li[@class="commits"]//a//span[contains(@class, "text-emphasized")]//text()').extract()[0].strip()
Otherwise, include num as well:
response.xpath('//li[@class="commits"]//a//span[@class="num text-emphasized"]//text()').extract()[0].strip()
Also, I use extract()[0] to retrieve the first string returned by the XPath and strip() to remove the surrounding whitespace, resulting in just the number.
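The difference between comparing the whole class attribute and checking for a single class name can be reproduced without Scrapy. The following is a minimal sketch that assumes the standard library's xml.etree.ElementTree as a stand-in for Scrapy selectors; since ElementTree has no contains(), the class-name check is done in Python, and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Markup from the question (with the closing </li> completed).
markup = '''<li class="commits">
<a data-pjax="" href="/samthomson/flot/commits/master">
<span class="octicon octicon-history"></span>
<span class="num text-emphasized">
521
</span>
commits
</a>
</li>'''

root = ET.fromstring(markup)

# Exact comparison against a single class name: no match, because the
# attribute value is the full string "num text-emphasized".
exact_single = root.findall(".//span[@class='text-emphasized']")

# Exact comparison against the full attribute value: one match.
exact_full = root.findall(".//span[@class='num text-emphasized']")

# Class-name check, similar in spirit to the CSS selector span.text-emphasized.
by_token = [s for s in root.iter("span")
            if "text-emphasized" in s.get("class", "").split()]

# Strip the surrounding newlines to get just the number.
commits = by_token[0].text.strip()
print(len(exact_single), len(exact_full), commits)
```

Note that contains(@class, "text-emphasized") is a substring test, so it would also match a class such as text-emphasized-extra, while the split-based check above only matches whole class names.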

Resources