XPath valid in Firefox but not in Chrome - xpath

I am trying to find a menu element via XPath in the JupyterLab UI; The following is an extract of the list of elements in the menu I am interested in, and should be a good minimal example of my problem:
<li tabindex="0" aria-disabled="true" role="menuitem" class="lm-Menu-item p-Menu-item lm-mod-disabled p-mod-disabled lm-mod-hidden p-mod-hidden" data-type="command" data-command="filemenu:logout">
<div class="f1vya9e0 lm-Menu-itemIcon p-Menu-itemIcon jp-Icon"></div>
<div class="lm-Menu-itemLabel p-Menu-itemLabel">Log Out</div>
<div class="lm-Menu-itemShortcut p-Menu-itemShortcut"></div>
<div class="lm-Menu-itemSubmenuIcon p-Menu-itemSubmenuIcon"></div>
</li>
<li tabindex="0" role="menuitem" class="lm-Menu-item p-Menu-item" data-type="command" data-command="hub:logout"><div class="f1vya9e0 lm-Menu-itemIcon p-Menu-itemIcon jp-Icon">
<div class="f1vya9e0 lm-Menu-itemIcon p-Menu-itemIcon jp-Icon"></div>
<div class="lm-Menu-itemLabel p-Menu-itemLabel">Log Out</div>
<div class="lm-Menu-itemShortcut p-Menu-itemShortcut"></div>
<div class="lm-Menu-itemSubmenuIcon p-Menu-itemSubmenuIcon"></div>
</li>
As you can see, both <li> items contain a <div> with the text Log Out, which is my main problem, as I am trying to write a general Xpath expression that can work for any Menu item. What I am currently trying to use is:
//div[contains(#class, 'p-Menu-itemLabel')][text() = '${item}']
Where ${item} can be any menu item, as all <li> items will have a similar div with text in them. The problem arises with the Log Out item, which is the only one that is repeated twice. In order to handle this special case, I have though of using
//div[contains(#class, 'p-Menu-itemLabel')][text() = 'Log Out']/..[not(contains(#class,'p-mod-hidden'))]
Since either one of the two <li> items will not contain that specific class (i.e., the currently active Log Out element).
This XPath works fine in Firefox and finds the element I am looking for everytime, however Chrome complains that it is not a valid XPath expression. Somehow this reduced version:
//div[contains(#class, 'p-Menu-itemLabel')][text() = 'Log Out']/..
works in Chrome, but any time I try to use an attribute selector on the parent element (i.e. /..[something]) it fails to recognize it as a valid XPath.
Does anyone have any idea of why? And what can I do to make Chrome recognize it as a valid XPath?

It seems that Chrome doesn't like applying a predicate directly from the .. parent axis.
But you can modify to use the long form: parent::*
//div[contains(#class, 'p-Menu-itemLabel')][text() = 'Log Out']/parent::*[not(contains(#class,'p-mod-hidden'))]
Or apply the self::* axis and then apply the predicate:
//div[contains(#class, 'p-Menu-itemLabel')][text() = 'Log Out']/../self::*[not(contains(#class,'p-mod-hidden'))]

Related

How to select by non-direct child condition in Xpath?

I would like to show an example.
This how the page looks:
<a class="aclass">
<div class="divclass"></div>
<div id="innerclass">
<span class="spanclass">Hello</span>
</div>
</a>
<a class="aclass">
<div class="divclass"></div>
<div id="innerclass">
<span class="spanclass">Pick Delivery Location</span>
</div>
</a>
I want to select anchor tags that have a child (direct or non-direct) span that has the text 'Hello'.
Right now, I do something like this:
//a[#class='aclass'][div/span[text() = 'Hello']]
I want to be able to select without having to select direct children (div in this case), like this:
//a[#class='aclass'][//span[text() = 'Hello']]
However, the second one finds all the anchor tags with the class 'aclass' rather than the one with the span with 'Hello' text.
I hope I worded my question clearly. Please feel free to edit if necessary.
In your attempt, // goes back to the root of the document - effectively you are saying "Give me the as for which there is a span anywhere in the document", which is why you get them all.
What you need is the descendant axis :
//a[#class='aclass' and descendant::span[text() = 'Hello']]
Note I have joined the conditions with and, but two separate conditions would also work.

Getting single element with similar xpaths but with different same level, "neighboring" node

I'm trying to get the xpath of an element with a similar xpath to others but has a "neighbor" element that's different . Please see example below.
<div>
<div id='a'> </div>
<span> Text here </span> #this is what i'm trying to get
</div>
<div>
<div id='b'> </div>
<span> Text here </span>
</div>
I tried using //div//span, but this gives me the 2 spans. So i tried using //div//child::div[#id='a']//ancestor::div//child::span, but it doesn't look pleasant and looks repetitive. Does this have a better implementation?
try
//div[div[#id='a']]/span
it says get the span child node of all div nodes with child node div (with an #id equal to 'a').

Select all nodes between two elements excluding unnecessary element from the intersection using XPath

There’s a document structured as follows:
<div class="document">
<div class="title">
<AAA/>
</div class="title">
<div class="lead">
<BBB/>
</div class="lead">
<div class="photo">
<CCC/>
</div class="photo">
<div class="text">
<!-- tags in text sections can vary. they can be `div` or `p` or anything. -->
<DDD>
<EEE/>
<DDD/>
<CCC/>
<FFF/>
<FFF>
<GGG/>
</FFF>
</DDD>
</div class="text">
<div class="more_text">
<DDD>
<EEE/>
<DDD/>
<CCC/>
<FFF/>
<FFF>
<GGG/>
</FFF>
</DDD>
</div class="more_text">
<div class="other_stuff">
<DDD/>
</div class="other_stuff">
</div class="document">
The task is to grab all the elements between <div class="lead"> and <div class="other_stuff"> except the <div class="photo"> element.
The Kayessian method for node-set intersection $ns1[count(.|$ns2) = count($ns2)] works perfectly. After substituting $ns1 with //*[#class="lead"]/following::* and $ns2 with //*[#class="other_stuff"]/preceding::*,
the working code looks like this:
//*[#class="lead"]/following::*[count(. | //*[#class="other_stuff"]/preceding::*)
= count(//*[#class="other_stuff"]/preceding::*)]/text()
It selects everything between <div class="lead"> and <div class="other_stuff"> including the <div class="photo"> element. I tried several ways to insert not() selector in the formula itself
//*[#class="lead" and not(#class="photo ")]/following::*
//*[#class="lead"]/following::*[not(#class="photo ")]
//*[#class="lead"]/following::*[not(self::class="photo ")]
(the same things with /preceding::* part) but they don't work. It looks like this not() method is ignored – the <div class="photo"> element remains in the selection.
Question 1: How to exclude the unnecessary element from this intersection?
It’s not an option to select from <div class="photo"> element excluding it automatically because in other documents it can appear in any position or doesn't appear at all.
Question 2 (additional): Is it OK to use * after following:: and preceding:: in this case?
It initially selects everything up to the end and to the beginning of the whole document. Could it be better to specify the exact end point for the following:: and preceding:: ways? I tried //*[#class="lead"]/following::[#class="other_stuff"] but it doesn’t seem to work.
Question 1: How to exclude the unnecessary element from this intersection?
Adding another predicate, [not(self::div[#class='photo'])] in this case, to your working XPath should do. For this particular case, the entire XPath would look like this (formatted for readability) :
//*[#class="lead"]
/following::*[
count(. | //*[#class="other_stuff"]/preceding::*)
=
count(//*[#class="other_stuff"]/preceding::*)
][not(self::div[#class='photo'])]
/text()
Question 2 (additional): Is it OK to use * after following:: and preceding:: in this case?
I'm not sure if it would be 'better', what I can tell is following::[#class="other_stuff"] is invalid expression. You need to mention the element to which the predicate will be applied, for example, 'any element' following::*[#class="other_stuff"], or just 'div' following::div[#class="other_stuff"].

Page Object Gem. Making accessors for elements without ids

Lets say I have a simple page that has less IDs than I'd like for testing
<div class="__panel_body">
<div class="__panel_header">Real Estate Rating</div>
<div class="__panel_body">
<div class="__panel_header">Property Rating Info</div>
<a class="icon.edit"></a>
<a class="icon.edit"></a>
</div>
<div class="__panel_body">
<div class="__panel_header">General Risks</div>
<a class="icon.edit"></a>
<a class="icon.edit"></a>
</div>
<div class="__panel_body">
<div class="__panel_header">Amenities</div>
<a class="icon.edit"></a>
<a class="icon.edit"></a>
</div>
</div>
I'm using Jeff Morgan's Page Object gem and I want to make accessors for the edit links in any given section.
The challenge is that the panel headers differentiate what body I want to choose. Then I need to access the parent and get all links with class "icon.edit". Assume I can't change the HTML to solve this.
Here's a start
module RealEstateRatingPageFields
div(:general_risks_section, ....)
def general_risks_edit_links
general_risks_section_element.links(class: "icon.edit")
end
end
How do I get the general_risks_section accessor to work, though?
I want that to represent the parent div to the panel header with text 'General Risks'...
There are a number of ways to get the general risk section.
Using a Block
The accessors can take a block where you can more programatically describe how to locate the element. This allows you to locate a distinguishing element and then traverse the DOM to the element you actually want. In this case, you can locate the header with the matching text and navigate to its parent.
div(:general_risks_section) { div_element(class: '__panel_header', text: 'General Risks').parent }
Using XPath
While harder to read and write, you could also use an XPath locator. The concept and thought process is the same as using the block. The only benefit is that it reduces the number of element calls, which slightly improves performance.
div(:general_risks_section, xpath: './/div[#class="__panel_body"][./div[#class="__panel_header" and text() = "General Risks"]]')
The XPath is saying:
.//div # Find a div element that
[#class="__panel_body"] # Has the class "__panel_body" and
[./div[ # Contains a div element that
#class="__panel_header" and # Has the class "__panel_header" and
text() = "General Risks" # Has the text "General Risks"
]]
Using the Body Text
Given the HTML, you could also just locate the section directly based on its text.
div(:general_risks_section, class: '__panel_body', text: 'General Risks')
Note that this assumes that the HTML given was not simplified. If there are actually other text nodes, this probably would not be the best option.

Extracting contents from a list split across different divs

Consider the following html
<div id="relevantID">
<div class="column left">
<h1> Section-Header-1 </h1>
<ul>
<li>item1a</li>
<li>item1b</li>
<li>item1c</li>
<li>item1d</li>
</ul>
</div>
<div class="column">
<ul> <!-- Pay attention here -->
<li>item1e</li>
<li>item1f</li>
</ul>
<h1> Section-Header-2 </h1>
<ul>
<li>item2a</li>
<li>item2b</li>
<li>item2c</li>
<li>item2d</li>
</ul>
</div>
<div class="column right">
<h1> Section-Header-3 </h1>
<ul>
<li>item3a</li>
<li>item3b</li>
<li>item3c</li>
<li>item3d</li>
</ul>
</div>
</div>
My objective is to extract the items for each Section headers. However, inconveniently the designer of the webpage decided to break up the data into three columns, adding an additional div (with classes column right etc).
My current method of extraction was using the xpath
for section headers, I use the xpath (get all h1 elements withing a div with given id)
//div[#id="relevantID"]//h1
above returns a list of h1 elements, looping over each element I apply the additional selector, for each matched h1 element, look up the next ul node and retreive all its li nodes.
following-sibling::ul//li
But thanks to the designer's aesthetics, I am failing in the one particular case I've marked in the HTML file. Where the items are split across two different column divs.
I can probably bypass this problem by stripping out the column divs entirely, but I don't think modifying the html to make a selector match is considered good (I haven't seen it needed anywhere in the examples I've browsed so far).
What would be a good way to extract data that has been formatted like this? Full solutions are not neccessary, hints/tips will do. Thanks!
The columns do frustrate use of following-sibling:: and preceding-sibling::, but you could instead use the following:: and preceding:: axis if the columns at least keep the list items in proper document order. (That is indeed the case in your example.)
The following XPath will select all li items, regardless of column, occurring after the "Section-Header-1" h1 and before the "Section-Header-2" h1 header in document order:
//div[#id='relevantID']//li[normalize-space(preceding::h1) = 'Section-Header-1'
and normalize-space(following::h1) = 'Section-Header-2']
Specifically, it selects the following items from your example HTML:
<li>item1a</li>
<li>item1b</li>
<li>item1c</li>
<li>item1d</li>
<li>item1e</li>
<li>item1f</li>
You can combine following-sibling and preceding-sibling to get possible li elements in a div before the h2 and use the union operator |. As example for the second h2:
((//div[#id="relevantID"]//h1)[2]/preceding-sibling::ul//li) |
((//div[#id="relevantID"]//h1)[2]/following-sibling::ul//li)
Result:
<li>item1e</li>
<li>item1f</li>
<li>item2a</li>
<li>item2b</li>
<li>item2c</li>
<li>item2d</li>
As you're already selecting all h1 using //div[#id="relevantID"]//h1 and retrieving all li items for each h1 using as a second step following-sibling::ul//li, you could combine this to following-sibling::ul//li | preceding-sibling::ul//li.

Resources