XPath: Select any div that contains one or more descendant divs with a specific class - xpath

Assume that the following HTML snippet exists somewhere in the <body> element of a web page:
<div id="root_1000" class="root bacon">
<ul>
<li id="item_1234567" class="active">
<div class="userpost author_4281">
<div>This text should be visible.</div>
</div>
<ul><li>Some item</li></ul>
</li>
</ul>
</div>
<div id="root_2000" class="root bacon">
<ul>
<li id="item_8675309" class="active">
<div class="userpost author_3333">
<div>
This text, and as the DIV.root that contains it, should be hidden.
</div>
</div>
<ul><li>Another item</li></ul>
</li>
</ul>
</div>
<div id="root_3000" class="root bacon">
<ul>
<li id="item_7654321" class="active">
<div class="userpost author_9877">
<div>This text should be visible.</div>
</div>
<ul><li>Yet another item</li></ul>
</li>
</ul>
</div>
So here's my question: what would the XPath syntax be to select the div.root that contains info posted by author #3333 (i.e. div[class~="author_3333"])?
The following XPath statement will properly match the div.userpost element associated with author #3333 that I want to hide, but does not include the <ul><li>Another item</li></ul> node, which I also need to hide:
.//div[contains(@class, 'author_3333')]
What I want to do is select the closest div.root ancestor associated with the node that my XPath statement matches. Any help would be greatly appreciated... thanks in advance!

You need to select the div that has the author's div as a descendant, something like:
//div[.//div[contains(@class, "author_3333")]]

You can use this XPath expression:
.//div[contains(@class, 'author_3333')]/ancestor::div[contains(@class,'root')][1]
Output is:
<div id="root_2000" class="root bacon">
<ul>
<li id="item_8675309" class="active">
<div class="userpost author_3333">
<div>
This text, and as the DIV.root that contains it, should be hidden.
</div>
</div>
<ul>
<li>Another item</li>
</ul>
</li>
</ul>
</div>
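For readers without a full XPath engine, here is a minimal sketch of the same "closest div.root ancestor" lookup using only Python's standard library. It assumes a trimmed, well-formed copy of the markup wrapped in a hypothetical <body> element, and the helper name closest_root is made up for illustration. Since xml.etree.ElementTree supports neither contains() nor the ancestor:: axis, the walk upwards is done by hand, and the class check matches whole class tokens rather than substrings.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed version of the question's markup (an assumption:
# the real page would need an HTML parser rather than an XML one).
HTML = """
<body>
  <div id="root_1000" class="root bacon">
    <ul><li><div class="userpost author_4281"><div>visible</div></div></li></ul>
  </div>
  <div id="root_2000" class="root bacon">
    <ul><li>
      <div class="userpost author_3333"><div>hidden</div></div>
      <ul><li>Another item</li></ul>
    </li></ul>
  </div>
</body>
"""


def closest_root(tree_root, author_class):
    """Return the nearest div.root ancestor of the div carrying author_class."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in tree_root.iter() for child in parent}
    for div in tree_root.iter("div"):
        if author_class in div.get("class", "").split():
            node = parent_of.get(div)
            # Walk upwards until a div whose class list contains "root".
            while node is not None:
                if node.tag == "div" and "root" in node.get("class", "").split():
                    return node
                node = parent_of.get(node)
    return None


match = closest_root(ET.fromstring(HTML), "author_3333")
print(match.get("id"))  # root_2000
```

With a library that implements full XPath 1.0, the ancestor:: expression above can be passed to the engine directly instead.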

Related

Scrapy xpath select parent element based on text value in subelement and lacking of element

I want to select all elements article that don't contain a span element with class status and where the nested a element contains a href attribute which contains the text "rent.html".
I've managed to get the a element like so:
response.xpath('//article[@class="car"]//a[contains(@href,"rent.html")]')
But reading here and trying to select the first parent element article like so returns "data=0"
response.xpath('//article[@class="car"]//a[contains(@href,"rent.html")]//parent::article and not //article[@class="car"]//span[@class="status"]')
I also tried this.
response.xpath('//article[@class="car"][//a[contains(@href,"rent.html")]/article and not //article[@class="car"]//span[@class="status"]')')
I don't know what the expression is for my use case.
<article class="car">
<div>
<div class="container">
<a href="/34625030/rent.html">
</a>
</div>
</div>
</article>
<article class="car">
<div>
<div class="container">
<a href="/34625230/rent.html">
</a>
</div>
</div>
</article>
<article class="car">
<div>
<div class="container">
<a href="/12325230/buy.html">
</a>
</div>
</div>
</article>
<article class="car">
<div>
<div class="container">
<a href="/34632230/rent.html">
</a>
</div>
</div>
<span class="status">Rented</span>
</article>
This XPath expression will do the work:
"//article[not(.//span[@class='status'])][.//a[contains(@href,'rent.html')]]"
The entire command is:
response.xpath("//article[not(.//span[@class='status'])][.//a[contains(@href,'rent.html')]]")
Explanations:
Translating your requirements into XPath syntax.
"select all elements article" - //article
"that don't contain a span element with class status" - [not(.//span[@class='status'])]
" and where the nested a element contains a href attribute which contains the text "rent.html"" - [.//a[contains(@href,'rent.html')]]
I tested the XPath above on the shared sample XML and it worked properly.
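As a cross-check of the two conditions, here is a small sketch that applies them with Python's standard library instead of Scrapy. The sample is a shortened copy of the markup wrapped in a hypothetical <root> element, and rentable_articles is an illustrative name, not part of any library. Because xml.etree.ElementTree has no not() or contains(), each predicate is written as plain Python.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's sample, wrapped in a made-up <root>.
SAMPLE = """
<root>
  <article class="car"><div><a href="/34625030/rent.html"></a></div></article>
  <article class="car"><div><a href="/12325230/buy.html"></a></div></article>
  <article class="car">
    <div><a href="/34632230/rent.html"></a></div>
    <span class="status">Rented</span>
  </article>
</root>
"""


def rentable_articles(root):
    """Articles with a rent.html link and no span.status descendant."""
    selected = []
    for article in root.iter("article"):
        # Mirrors [not(.//span[@class='status'])]
        has_status = article.find(".//span[@class='status']") is not None
        # Mirrors [.//a[contains(@href,'rent.html')]]
        has_rent_link = any(
            "rent.html" in a.get("href", "") for a in article.iter("a")
        )
        if has_rent_link and not has_status:
            selected.append(article)
    return selected


hrefs = [a.find(".//a").get("href") for a in rentable_articles(ET.fromstring(SAMPLE))]
print(hrefs)  # ['/34625030/rent.html']
```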

How to find xpath in label class?

Can someone assist? I need an XPath that identifies the label with class ticket-type for the last ticket type. The HTML is below.
<div class="vertical-accordion">
<ul id="accordion-5" class="accordion">
<li class="open-li">
<a class="toggle-link open" href="#">
<span>Tickets on your smart card</span>
</a>
<div class="accordion-drop" style="display: block;">
<ul>
<li>
<li>
<li>
<label class="ticket-type">megarider</label>
<label class="validity"> (7 day: 16 June to 22 June 2016 ) </label>
</li>
</ul>
</div>
</li>
</ul>
</div>
</div>
</div>
</div>
</div>
<script type="text/javascript">
</div>
</div>
</div>
</article>
If you want to find the last ticket-type, I assume there are multiple ticket types.
You didn't provide much HTML, so I'll answer under that assumption:
(//label[@class='ticket-type'])[last()]
The parentheses make last() apply to the whole result set rather than to the labels of each parent separately.
Hope it helps!
You can use this XPath to find the element in your example:
//label[@class='ticket-type'][1]
Alternatively, open the page in Chrome, press F12 to open Developer Tools, inspect the element, find the label in the Elements tab, right-click the label tag, and select Copy -> Copy XPath.
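To illustrate picking the last ticket-type label in document order, here is a rough sketch using only Python's standard library. The markup is a cleaned-up guess at the list structure; the "dayrider" entry and the last_ticket_type helper are invented for the example. Taking the last item of the result list in Python avoids relying on how a given engine scopes a positional predicate.

```python
import xml.etree.ElementTree as ET

# Cleaned-up, hypothetical version of the accordion list; "dayrider" is
# made up so that there is more than one ticket type to choose from.
SAMPLE = """
<ul>
  <li>
    <label class="ticket-type">dayrider</label>
    <label class="validity">(1 day)</label>
  </li>
  <li>
    <label class="ticket-type">megarider</label>
    <label class="validity">(7 day: 16 June to 22 June 2016)</label>
  </li>
</ul>
"""


def last_ticket_type(root):
    """Text of the last label.ticket-type in document order, or None."""
    labels = root.findall(".//label[@class='ticket-type']")
    return labels[-1].text if labels else None


print(last_ticket_type(ET.fromstring(SAMPLE)))  # megarider
```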

reject li dom element having specific attributes

I am trying to scrape a page with Ruby and Nokogiri and collect a set of links from the DOM. The links sit inside a list of li elements, and some of those li elements carry a specific attribute. I need to reject the li elements that have that attribute and get the link tags of the remaining ones.
Here is what my DOM looks like.
<ul>
<li class="carousel-list-item">
<a itemprop="url" data-cr="CharNav23" class="property-icon property-icon-14" href="/max-and-shred/">
<div itemprop="name" class="property-tooltip">
Max & Shred
</div>
</a>
</li>
<li class="carousel-list-item">
<a itemprop="url" data-cr="CharNav24" class="property-icon property-icon-19" href="/rabbids-invasion/">
<div itemprop="name" class="property-tooltip">
Rabbids Invasion
</div>
</a>
</li>
<li data-sponsor="Sponsor" class="carousel-list-item">
<a itemprop="url" data-cr="CharNav21" class="property-icon property-icon-40" target="_blank" href="http://pubads.g.doubleclick.net/gampad/clk?id=47616903&iu=8675">
<div itemprop="name" class="property-tooltip">
LEGO Friends
</div>
</a>
</li>
<li class="carousel-list-item">
<a itemprop="url" data-cr="CharNav24" class="property-icon property-icon-19" href="/rubyds-investment/">
<div itemprop="name" class="property-tooltip">
Rabbids Invasion
</div>
</a>
</li>
</ul>
I need to collect all a tags whose li parents don't have the data-sponsor="Sponsor" attribute. I tried the following, but it includes all li elements.
page.search('ul.carousel-list > li > a').map{ |link| make_absolute(link['href']) }
The CSS way to do that is:
page.search('li:not([data-sponsor]) a')
or
page.search('li:not([data-sponsor=Sponsor]) a')
Probably a better option than xpath.
You should try:
# this selects the a elements inside every li that has no attribute named 'data-sponsor'.
page.search('//ul[@class="carousel-list"]/li[not(@data-sponsor)]/a').map{ |link| make_absolute(link['href']) }
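The same filter can be sketched outside Nokogiri. The version below uses Python's standard library on a shortened, well-formed copy of the list; the carousel-list class on the ul, the example.com sponsor URL, and the unsponsored_hrefs name are all assumptions made for the illustration. It keeps only links whose parent li has no data-sponsor attribute.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the list; the ul class and the sponsor URL are assumed.
SAMPLE = """
<ul class="carousel-list">
  <li class="carousel-list-item"><a href="/max-and-shred/">Max &amp; Shred</a></li>
  <li data-sponsor="Sponsor" class="carousel-list-item">
    <a href="http://example.com/sponsored">LEGO Friends</a>
  </li>
  <li class="carousel-list-item"><a href="/rabbids-invasion/">Rabbids Invasion</a></li>
</ul>
"""


def unsponsored_hrefs(ul):
    """href of every a whose parent li lacks a data-sponsor attribute."""
    return [
        a.get("href")
        for li in ul.findall("li")
        if li.get("data-sponsor") is None  # mirrors li[not(@data-sponsor)]
        for a in li.findall("a")
    ]


print(unsponsored_hrefs(ET.fromstring(SAMPLE)))
# ['/max-and-shred/', '/rabbids-invasion/']
```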

How to fill my html with json object?

I am new to Ajax.
I am trying to take values from a response and fill them into an HTML snippet.
I have the HTML code and a JSON object that holds the values.
Now I want to show specific values in different parts of the HTML.
Here is my HTML code:
<div>
<div class="borb clearfix">
<div class="profileholder fleft">
<img src="images/users/1.png" class="userpic">
<div class="icon state green"></div>
</div>
<div class="remainder">
<div class="padl10">
<div class="username">Anurag Shivpuri</div>
<div class="desig">Cheif Information Officer</div>
<div class="loc">Credit Operation | Pune</div>
</div>
</div>
</div>
<ul class="userdata">
<li>
<span class="lbl">Employee Code :</span>
<span> 2007</span>
</li>
<li>
<span class="lbl">Role_Designation :</span>
<span> Senior HR</span>
</li>
<li>
<span class="lbl">Department :</span>
<span> HR</span>
</li>
<li>
<span class="lbl">Sub_Department :</span>
<span> Talent Acquisition</span>
</li>
<li>
<span class="lbl">Official E-mail Id :</span>
<span> atul.gupta#bajajfinserve.co.in</span>
</li>
<li>
<span class="lbl">Mobile No :</span>
<span> 9844333932</span>
</li>
</ul>
<div class="footbar">
Reward
Incentive
Movement
Leaves
LnD
</div>
My Ajax call succeeds and returns the values; now I want to fill them into my HTML.
Please help me.
You can use a library (like knockout) to do that, or you can use jQuery to create the elements:
$("<div>").append("some text").appendTo("body"); //creates a div, append some text and...
If you already have a success event in your ajax, you could grab an empty DOM element and set its content.
document.getElementById('placeholder').innerHTML = jsonData.myProperty;
You should use innerHTML in JavaScript.
Set some ID on the element you need to update:
<li>
<span class="lbl">Employee Code :</span>
<span id='employeeCode'> 2007</span>
</li>
Then update it with your json value:
document.getElementById('employeeCode').innerHTML = yourJsonObject.employeeCodeValue;

xpath syntax for selecting the text after <strong> tag

<div class="article_details">
<h1>Product name is</h1>
<div class="left">
<ul class="article_list">
<li>
<strong>art. nr.:</strong>
VS7896
</li>
<li>
<b>Shipping time</b>
: 1-3 Days
</li>
</ul>
</div>
I used //DIV[@class='left']/UL[1]/LI[1] but the result is "art. nr.: VS7896".
Please help me with the correct XPath to select just "VS7896".
To select the text after <strong>, use
//strong/following-sibling::text()[1]
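For comparison, here is a minimal sketch of the same idea with Python's standard library, which has no following-sibling:: axis. In xml.etree.ElementTree the text that follows an element's closing tag is stored in that element's tail attribute, so the value after <strong> can be read from there. The text_after_strong helper is an illustrative name, and the sample is a well-formed copy of the relevant part of the markup.

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the relevant part of the question's markup.
SAMPLE = """
<div class="left">
  <ul class="article_list">
    <li>
      <strong>art. nr.:</strong>
      VS7896
    </li>
    <li>
      <b>Shipping time</b>
      : 1-3 Days
    </li>
  </ul>
</div>
"""


def text_after_strong(root):
    """Whitespace-trimmed text that directly follows the first <strong>."""
    strong = root.find(".//strong")
    if strong is None or strong.tail is None:
        return None
    # .tail holds the text between </strong> and the next tag.
    return strong.tail.strip()


print(text_after_strong(ET.fromstring(SAMPLE)))  # VS7896
```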

Resources