I want to select all the LI elements that contain a SPAN with id="liveDeal152_dealPrice" as a descendant. How do I do this with XPath?
Here is a sample of the HTML:
<ul>
<li id="liveDeal_152">
<p class="price">
<em>|
<span class="WebRupee">₹ </span>
<span id="liveDeal152_dealPrice">495 </span>
</p>
</li>
<li id="liveDeal_152">
<p class="price">
<em>|
<span class="WebRupee">₹ </span>
(price hidden)
</p>
</li>
</ul>
//li[.//span[@id = 'liveDeal152_dealPrice']] should do. Or, more verbose but closer to your textual description: //li[descendant::span[@id = 'liveDeal152_dealPrice']].
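To try the idea outside a browser, here is a minimal sketch using Python's standard-library ElementTree. ElementTree implements only a subset of XPath and cannot nest a path inside a predicate, so the "has a matching span descendant" test is written as a per-li check in Python instead of the single expression above. The sample is a well-formed stand-in for the question's markup (the unclosed em is dropped), which is an assumption made for the sake of the sketch.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the question's markup; the unclosed <em>
# from the original snippet is left out so the XML parser accepts it.
SAMPLE = """
<ul>
  <li id="liveDeal_152">
    <p class="price">
      <span class="WebRupee">₹ </span>
      <span id="liveDeal152_dealPrice">495 </span>
    </p>
  </li>
  <li id="liveDeal_152">
    <p class="price">
      <span class="WebRupee">₹ </span>
      (price hidden)
    </p>
  </li>
</ul>
"""

root = ET.fromstring(SAMPLE)
SPAN_PATH = ".//span[@id='liveDeal152_dealPrice']"

# Keep every <li> that has at least one matching <span> descendant.
# This mirrors //li[.//span[@id='liveDeal152_dealPrice']].
matching = [li for li in root.iter("li") if li.find(SPAN_PATH) is not None]

print(len(matching))
print(matching[0].find(SPAN_PATH).text.strip())
```

With a full XPath 1.0 engine the nested predicate can be passed through unchanged; the loop is only needed because of ElementTree's limited path syntax.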
Use this
//li[.//span[@id="liveDeal152_dealPrice"]]
It selects
ALL <li> ELEMENTS
//li[ ]
THAT HAVE A <span> DESCENDANT
.//span[ ]
WITH id ATTRIBUTE EQUAL TO "liveDeal152_dealPrice"
@id="liveDeal152_dealPrice"
That said, it doesn't seem like a very wise element selection, mostly because the id looks dynamically generated. If you're going to use it once, it's probably OK, but if you're using it, say, for testing and will reuse it many times, it might cause trouble. Are you sure this won't change when you change your website and/or database?
As a side note:
ul stands for "unordered list"
ol stands for "ordered list"
li stands for "list item"
I am trying to scrape a web page for NAME OF COMPANY and CITY AND STATE OF COMPANY shown below.
I have an xpath code snippet that identifies both text elements at the same time:
//span[starts-with(@class,"text-align")]/text()[2]
This xpath snippet pulls the first text value (COMPANY NAME). How do I get the second text element (CITY,STATE)?
A snip of the web page code looks like this:
<div>
<ul class="pv-top-card-v3--experience-list">
<li>
<a class="pv-top-card-v3--experience-list-item" href="#" data-control-name="position_see_more" data-ember-action="" data-ember-action-172="172">
<img src="https://media.licdn.com/dms/image/C4E0BAQFhA8h46hvabA/company-logo_100_100/0?e=1582761600&v=beta&t=VAeZqaGu3Lu6Ol_n5kiiI74FSRuSOZA1ggAI5qTVRjE" id="ember173" class="EntityPhoto-square-1 flex-shrink-zero ember-view">
<span id="ember174" class="text-align-left ml2 t-14 t-black t-bold full-width lt-line-clamp lt-line-clamp--multi-line ember-view" style="-webkit-line-clamp: 2"> THIS IS THE NAME OF A COMPANY
<!----></span>
</a>
</li>
<li>
<a class="pv-top-card-v3--experience-list-item" href="#" data-control-name="education_see_more" data-ember-action="" data-ember-action-176="176">
<img src="https://media.licdn.com/dms/image/C560BAQEr2uQX-x2EwQ/company-logo_100_100/0?e=1582761600&v=beta&t=aDbYLUDMvlS4DpwOLjOaQj3Dj60C_cYLC5UUvGoyld0" id="ember177" class="EntityPhoto-square-1 flex-shrink-zero ember-view">
<span id="ember178" class="text-align-left ml2 t-14 t-black t-bold full-width lt-line-clamp lt-line-clamp--multi-line ember-view" style="-webkit-line-clamp: 2"> THIS IS THE CITY AND STATE OF COMPANY
<!----></span>
</a>
</li>
</ul>
</div>
The xpath string is picking up the two span elements using class. I can't use the span id attributes because they are dynamic and change with each page (one page per company).
Can someone advise how I extract the desired text?
Thanks.
Point to the li level:
//ul/li[2]/a/span[starts-with(@class,"text-align")]
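A rough sketch of the positional approach, using Python's standard-library ElementTree on a trimmed copy of the markup (the img elements and most attributes are omitted, and the class values are shortened; those simplifications are mine). ElementTree has no starts-with(), so the class check is done in Python; span_text is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed version of the question's markup.
SAMPLE = """
<div>
  <ul class="pv-top-card-v3--experience-list">
    <li><a href="#"><span id="ember174" class="text-align-left ml2 t-14"> THIS IS THE NAME OF A COMPANY </span></a></li>
    <li><a href="#"><span id="ember178" class="text-align-left ml2 t-14"> THIS IS THE CITY AND STATE OF COMPANY </span></a></li>
  </ul>
</div>
"""

root = ET.fromstring(SAMPLE)


def span_text(li_position):
    """Return the text of the span inside the n-th <li> (1-based), or None."""
    # li[n] is a positional predicate: the n-th <li> child of the <ul>.
    for span in root.findall(f".//ul/li[{li_position}]/a/span"):
        # Stand-in for starts-with(@class, "text-align").
        if span.get("class", "").startswith("text-align"):
            return span.text.strip()
    return None


print(span_text(1))  # company name
print(span_text(2))  # city and state
```

Selecting by li position avoids the dynamic ember ids entirely, at the cost of assuming the company is always the first list item and the location the second.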
I have noticed that XPath axes sometimes seem to return the wrong nodes. I have two examples:
url: "http://demo.guru99.com/v1/"
<tr>
<td align="center">
<img src="../images/1.gif">
<img src="../images/3.gif">
<img src="../images/2.gif">
</td>
</tr>
I can select the three img elements with the child axis: "//td//child::img". However, when I use "//td//following-sibling::img", it still returns the second and third img elements. As far as I know, child and sibling are two different things, so why does this happen?
url: http://demo.guru99.com/selenium/guru99home/
<div class="rt-grid-12 rt-alpha rt-omega" id="rt-feature">
<div class="rt-grid-6 ">
<div class="rt-block">
<h3>
Desktop, mobile, and tablet access</h3>
<ul>
<li>
<p>
Free android App</p>
</li>
<li>
<p>
Download any tutorial for free</p>
</li>
<li>
<p>
Watch video tutorials from anywhere </p>
</li>
</ul>
<p>
<img alt="" src="images/app_google_play(1).png"></p>
</div>
</div>
<div class="rt-grid-5 ">
<div class="rt-block">
<img src="images/logo_respnsivsite.png"><br>
</div>
</div>
</div>
Here, if I use "//div[@id='rt-feature' and (@class='rt-grid-12 rt-alpha rt-omega')]//following-sibling::div", the div elements that should be children are still counted as siblings.
If I use "//div[@id='rt-feature' and (@class='rt-grid-12 rt-alpha rt-omega')]//parent::div", the element itself and its child div elements are all counted as parents.
This causes me a lot of confusion; please help.
Suggesting that the XPath parser returns the wrong nodes, rather than that you don't understand why it is returning what it does, is starting from the wrong mindset. Unless you know the XPath parser is unreliable, start with the assumption that it is right and your expectations are wrong. Then go to the spec and study the semantics of the expression you have written.
You will find that
//td//following-sibling::img
is an abbreviation for
/descendant-or-self::node()/td/descendant-or-self::node()/following-sibling::img
so you have asked for all the following siblings of all the descendants of all the td nodes, which is exactly what you are getting.
I've come across people who habitually write "//" in place of "/" as a sort of magic fairy dust without having the faintest idea what it means. Don't do it: read the spec.
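To make the expansion concrete, here is a small sketch that walks the same steps by hand with Python's standard-library ElementTree. ElementTree has no following-sibling axis, so the axis is emulated with a child-to-parent map; only element nodes are visited, whereas node() in real XPath also matches text and comment nodes. The markup is a compact, well-formed copy of the first example.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    '<tr><td align="center">'
    '<img src="../images/1.gif"/>'
    '<img src="../images/3.gif"/>'
    '<img src="../images/2.gif"/>'
    '</td></tr>'
)
root = ET.fromstring(SAMPLE)

# ElementTree elements have no parent pointer, so build one.
parent_of = {child: parent for parent in root.iter() for child in parent}


def following_sibling_imgs(node):
    """Emulate following-sibling::img for a single element."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return [s for s in siblings[siblings.index(node) + 1:] if s.tag == "img"]


selected = []
for td in root.iter("td"):                        # //td
    for node in td.iter():                        # /descendant-or-self::node()
        for img in following_sibling_imgs(node):  # /following-sibling::img
            if img not in selected:               # a node-set has no duplicates
                selected.append(img)

print([img.get("src") for img in selected])
# ['../images/3.gif', '../images/2.gif']
```

The td itself has no following img siblings, the first img contributes the second and third, and the second contributes the third again, which is why the result is the second and third images.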
I am trying to select an item from a dropdown list in Robot Framework (using RIDE), but I cannot get the item by variable name.
<div class="chosen-drop">
<div class="chosen-search">
<input type="text" autocomplete="off">
</div>
<ul class="chosen-results">
<li class="active-result" data-option-array-index="0">Geen optie gekozen</li>
<li class="active-result" data-option-array-index="2">ABB</li>
<li class="active-result" data-option-array-index="3">Algem</li>
<li class="active-result" data-option-array-index="4">AOV</li>
<li class="active-result" data-option-array-index="5">AW</li>
<li class="active-result" data-option-array-index="8">AOZ</li>
</ul>
</div>
I can use this and get the result:
Click Element    xpath=//*[@id="KEUZE_N_MiddelId_N1010D_chosen"]
Click Element    xpath=//*[@id="KEUZE_N_MiddelId_N1010D_chosen"]/div/ul/li[4]
But the index number can change, so I want to click the element based on the value, in this example 'ABB'. How can I achieve this?
You can try the following:
Select From List By Label| css=ul.chosen-results| ABB
It is very similar to this SO post but not exact enough to be considered a duplicate. Based on your already achieved results I think this should work for you.
//*[@id="KEUZE_N_MiddelId_N1010D_chosen"]/div/ul/li[text() = 'ABB']
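The text-based predicate can be checked outside Robot Framework. Below is a minimal sketch with Python's standard-library ElementTree, which spells the test as [.='ABB'] rather than [text() = 'ABB']. It only illustrates the selector logic, not the Robot Framework keyword call, and find_option is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the dropdown markup from the question.
SAMPLE = """
<div class="chosen-drop">
  <div class="chosen-search">
    <input type="text" autocomplete="off"/>
  </div>
  <ul class="chosen-results">
    <li class="active-result" data-option-array-index="0">Geen optie gekozen</li>
    <li class="active-result" data-option-array-index="2">ABB</li>
    <li class="active-result" data-option-array-index="3">Algem</li>
    <li class="active-result" data-option-array-index="4">AOV</li>
    <li class="active-result" data-option-array-index="5">AW</li>
    <li class="active-result" data-option-array-index="8">AOZ</li>
  </ul>
</div>
"""

root = ET.fromstring(SAMPLE)


def find_option(label):
    """Return the <li> whose complete text equals label, or None."""
    # Assumes label contains no single quote, which would break the path.
    return root.find(f".//ul[@class='chosen-results']/li[.='{label}']")


option = find_option("ABB")
print(option.get("data-option-array-index"))
```

Because the match is on the visible text, it keeps working when the list is reordered or the index attributes change.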
I'm trying to scrape a site with a highly varying HTML structure. The information of interest is not encapsulated. The only marker is a span with a target id TARGETID.
Structure is:
<h2>
<span class="TARGETID">TARGETID</span>
</h2>
<p> <!-- this is not always present, could be more p tags --> </p>
<ul> <!-- also not always present, if there, this is what we want --> </ul>
<h2>
<span class="SOMEIRRELEVANTID">IRRELEVANT</span>
</h2>
My approach was:
//h2/span[contains(text(), 'TARGETID')]/../following-sibling::ul[1][count(li) > 1][li]//a/text()
This succeeds when an unordered list is present after the TARGETID, but if not, it takes the next unordered list it finds (which makes sense based on the query).
My question is: How can I limit the query to the nodes between two H2s, starting with the one containing a span with the target id and limited by any following H2 with a span of a different id?
Any hints are greatly appreciated.
This XPath,
//ul[preceding::h2[1][.='TARGETID']]//a
will select all a elements beneath a ul that occurs after an h2 with a string value of "TARGETID" but before any other h2 elements.
So, for this expanded example,
<div>
<h2>
<span class="TARGETID">TARGETID</span>
</h2>
<p> <!-- this is not always present, could be more p tags --> </p>
<ul> link1 </ul>
<h2>
<span class="SOMEIRRELEVANTID">IRRELEVANT</span>
</h2>
<ul> link2 </ul>
<h2>
<span class="SOMEIRRELEVANTID">IRRELEVANT</span>
</h2>
</div>
it would select only
link1
and not link2, as requested.
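The same "nearest preceding h2" rule can be sketched in plain Python with the standard-library ElementTree, which has no preceding axis. This version assumes the h2 and ul elements are siblings under one container, as in the example, and it compares whitespace-normalized heading text (similar in spirit to normalize-space()). The a elements inside the lists and the name links_under_heading are my additions for illustration.

```python
import xml.etree.ElementTree as ET

# The expanded example, with hypothetical <a> elements inside each list.
SAMPLE = """
<div>
  <h2>
    <span class="TARGETID">TARGETID</span>
  </h2>
  <p>optional paragraph</p>
  <ul><li><a href="#">link1</a></li></ul>
  <h2>
    <span class="SOMEIRRELEVANTID">IRRELEVANT</span>
  </h2>
  <ul><li><a href="#">link2</a></li></ul>
  <h2>
    <span class="SOMEIRRELEVANTID">IRRELEVANT</span>
  </h2>
</div>
"""

root = ET.fromstring(SAMPLE)


def links_under_heading(container, heading_text):
    """Collect link texts from <ul>s whose nearest preceding <h2> matches."""
    links = []
    current_heading = None
    for child in container:
        if child.tag == "h2":
            # Whitespace-normalized string value of the heading.
            current_heading = " ".join("".join(child.itertext()).split())
        elif child.tag == "ul" and current_heading == heading_text:
            links.extend(a.text for a in child.iter("a"))
    return links


print(links_under_heading(root, "TARGETID"))
```

Each new h2 resets the current heading, so a list that follows a different heading is never attributed to TARGETID.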
I am trying to get the error message from a page on a site. The list contains several possible errors, so I can't check by id, but I do know that the one with display:list-item is the one I want. This is my rule, but it doesn't seem to work. What is wrong with it? What I want returned is the error text in the element.
//*[@id='errors']/ul/li[contains(@style,'display:list-item')]
Example dom elements:
<div id="errors" class="some class" style="display: block;">
<div class="some other class"></div>
<div class="some other class 2">
<span class="displayError">Please correct the errors listed in red below:</span>
<ul>
<li style="display:none;" id="invalidId">Enter a valid id</li>
<li style="display:list-item;" id="genericError">Something bad happened</li>
<li style="display:none;" id="somethingBlah" ............ </li>
....
</ul>
</div>
The correct XPath should be:
//*[@id='errors']//ul/li[contains(@style,'display:list-item')]
After //*[@id='errors'] you need an extra /, because <ul> is not directly beneath it. Using // again scans all underlying elements for <ul>.
If you can avoid // altogether, the expression will be faster and less resource-intensive.
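The difference between / and // after the id step can be seen in a short sketch with Python's standard-library ElementTree. The markup is a completed, well-formed version of the question's excerpt: the body wrapper and the text of the third li are placeholders of mine, since the original is truncated. ElementTree has no contains(), so the style check is done in Python, with spaces stripped in case the style is written as "display: list-item".

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<body>
  <div id="errors" class="some class" style="display: block;">
    <div class="some other class"></div>
    <div class="some other class 2">
      <span class="displayError">Please correct the errors listed in red below:</span>
      <ul>
        <li style="display:none;" id="invalidId">Enter a valid id</li>
        <li style="display:list-item;" id="genericError">Something bad happened</li>
        <li style="display:none;" id="somethingBlah">Placeholder text</li>
      </ul>
    </div>
  </div>
</body>
"""

root = ET.fromstring(SAMPLE)

# Single slash: <ul> would have to be a direct child of the #errors div.
direct = root.findall(".//*[@id='errors']/ul/li")

# Double slash: <ul> may sit at any depth below the #errors div.
nested = root.findall(".//*[@id='errors']//ul/li")

# Stand-in for contains(@style, 'display:list-item').
visible = [
    li.text
    for li in nested
    if "display:list-item" in li.get("style", "").replace(" ", "")
]

print(len(direct), len(nested), visible)
```

The single-slash path finds nothing because the ul is nested one div deeper, which matches the behaviour described in the question.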