Python Scrapy XPath: selecting text() returns nothing - xpath

I have this HTML code:
<div class="copy" style="padding:0px 45px 15px 30px;"><img src="/images/dot_clear.gif" height="1" width="623" border="0" class="">
<span class="copybold">
<span class="">viswa</span>
</span>
<div class="">
<span class="">first value</span><br class="">
<span class="">100-100-1000</span><br class="">
<span class="">2304 street</span><br class="">
<span class="">kannamapet</span>,
<span class="">TN</span>
<span class="">6000002</span><br class="">
</div>
Lic. Num:<span class="">234</span><br class="">
Lic. Year:<span class="">01/01/2001</span><br class="">
Lic. Expiration Year:<span class="">12/31/19</span><br class="">
Certificate approved .
<span class="copybold">
<span class="">Nathan</span>
</span>
<div class="">
<span class="">second value</span><br class="">
<span class="">200-200-2000</span><br class="">
<span class="">2367 street</span><br class="">
<span class="">kannamapet</span>,
<span class="">TN</span>
<span class="">6000002</span><br class="">
</div>
Lic. Num:<span class="">235</span><br class="">
Lic. Year:<span class="">01/01/2002</span><br class="">
Lic. Expiration Year:<span class="">12/31/2012</span><br class="">
Certificate approved .
I'm trying to get the Lic. Num values (234 and 235) and the Lic. Year values, but I always get null or an invalid XPath error.
I tried:
//*[contains(text(), 'Num')]
But this returns null.
Thanks in advance for the help.

Use the following xpath.
//span[@class='copybold']/following-sibling::span[1]

To get 234:
//div[@class='copy']/div[@class=''][1]/following-sibling::span[@class=''][1]/text()
To get 235:
//div[@class='copy']/div[@class=''][2]/following-sibling::span[@class=''][1]/text()

To get the license numbers:
//span[@class=''][preceding::text()[1][contains(.,'Lic. Num')]]/text()
Output:
234
235
To get the licence years:
//span[@class=''][preceding::text()[1][contains(.,'Lic. Year')]]/text()
Output:
01/01/2001
01/01/2002
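The rule behind these expressions ("take the span whose nearest preceding text node contains the label") can be tried without Scrapy. The following is a minimal sketch using only Python's standard html.parser module. It does not evaluate XPath; it imitates the same rule by hand. The class LabelledSpanParser, the function values_after_label, and the abridged SAMPLE markup are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser


class LabelledSpanParser(HTMLParser):
    """Collect the text of each <span class=""> whose nearest preceding
    text node contains a given label (hypothetical helper, not Scrapy API)."""

    def __init__(self, label):
        super().__init__()
        self.label = label
        self.last_text = ""    # most recent text node seen so far
        self.capture = False   # True right after a matching <span> opens
        self.values = []

    def handle_starttag(self, tag, attrs):
        # Mirrors span[@class=''] with a preceding text node holding the label.
        if (tag == "span" and dict(attrs).get("class") == ""
                and self.label in self.last_text):
            self.capture = True

    def handle_endtag(self, tag):
        if tag == "span":
            self.capture = False  # an empty span yields no value

    def handle_data(self, data):
        if self.capture:
            self.values.append(data.strip())
            self.capture = False
        self.last_text = data


def values_after_label(html, label):
    parser = LabelledSpanParser(label)
    parser.feed(html)
    parser.close()
    return parser.values


# Abridged copy of the markup from the question (address lines omitted).
SAMPLE = """<div class="copy">
<span class="copybold">
<span class="">viswa</span>
</span>
<div class="">
<span class="">first value</span><br class="">
</div>
Lic. Num:<span class="">234</span><br class="">
Lic. Year:<span class="">01/01/2001</span><br class="">
Lic. Expiration Year:<span class="">12/31/19</span><br class="">
Certificate approved .
<span class="copybold">
<span class="">Nathan</span>
</span>
<div class="">
<span class="">second value</span><br class="">
</div>
Lic. Num:<span class="">235</span><br class="">
Lic. Year:<span class="">01/01/2002</span><br class="">
Lic. Expiration Year:<span class="">12/31/2012</span><br class="">
Certificate approved .
</div>"""

license_numbers = values_after_label(SAMPLE, "Lic. Num")
license_years = values_after_label(SAMPLE, "Lic. Year")
print(license_numbers)
print(license_years)
```

Because "Lic. Expiration Year:" does not contain the substring "Lic. Year", the expiration dates are not picked up, which matches the behaviour of the contains() test in the XPath.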

Related

How to get the href value using HtmlAgilityPack?

How do I retrieve the value of the href attribute using HtmlAgilityPack?
<div class="row">
<div class="col-sm-24 businessCapsule--ctas">
<a href="http://www.xyz.coo.in" data-tracking="FLE:WL:CLOSED" class="businessCapsule--ctaItem" target="_blank" rel="nofollow noopener">
<div class="icon icon-Business-website" title="Website"></div> Website</a>
<div class="businessCapsule--telephone">
<div class="business--telephone business--telephone-noMarginRight">
<span class="icon icon-phone business--telephoneIcon"></span>
<div class="business--telephoneContent">
<span class="business--telephonePrefix">Tel</span>
<span class="business--telephoneNumber" itemprop="telephone">154 75 695 451 </span>
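Try a.Attributes.First().Value, where "a" is the HtmlNode you want.
Note that this returns the value of the node's first attribute, which in the markup above happens to be href; a.Attributes["href"].Value selects the attribute by name instead.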

Spring Thymeleaf SpEL - if null not working

I have the code below. Everything runs fine except the check for whether the value is null. I know the value being returned is null, yet the check still doesn't work. I've tried putting == null both inside and outside the {} to no avail.
Maybe it has something to do with how Hibernate is returning the value? When I print out the object returned from the DB, it says null.
<div th:each="timecardLast: ${timecardLast}">
<a th:href="@{/timecardin}">
<div th:if="${timecardLast.status == null}" style="width: 100%" class="waves-effect card-panel green darken-1 z-depth-4">
<div class="card-content center-align">
<i class="medium material-icons white-text">timer</i>
<h5 class="white-text">SIGN IN TO WORK</h5>
</div>
</div>
</a>
<a th:href="@{/timecardin}">
<div th:if="${timecardLast.status} == 1" style="width: 100%" class="waves-effect card-panel green darken-1 z-depth-4">
<div class="card-content center-align">
<i class="medium material-icons white-text">timer</i>
<h5 class="white-text">SIGN IN TO WORK</h5>
</div>
</div>
</a>
<a th:href="@{/timecradin}">
<div th:if="${timecardLast.status} == 2" style="width: 100%" class="waves-effect card-panel deep-orange darken-2 z-depth-4">
<div class="card-content center-align">
<i class="medium material-icons white-text">timer</i>
<h5 class="white-text">SIGN IN TO WORK</h5>
</div>
</div>
</a>
<a th:href="@{/timecardout}">
<div th:if="${timecardLast.status} == 0" style="width: 100%" class="waves-effect card-panel deep-orange darken-2 z-depth-4">
<div class="card-content center-align">
<i class="medium material-icons white-text">timer_off</i>
<h5 class="white-text">SIGN OUT OF WORK</h5>
</div>
</div>
</a>
</div>
In th:each, the name on the left of the colon is the loop variable for one element and the expression on the right is the list being iterated. You are using the same name for both. So you may want instead:
<div th:each="timecard: ${timecardList}">
...
<div th:if="${timecard.status == null}"...>
...
This assumes you have added a list of objects called timecardList to the model.
I took a step back and looked at how I was creating the object. I was not initializing the object first, i.e.
Timecard timecardLast = timecardService.getLastTimecardByIdusers(staff);
So simply initializing the object first, followed by the DB request, did the job, i.e.
timecardLast = new Timecard();
timecardLast = timecardService.getLastTimecardByIdusers(staff);

The result of an xpath is {{price}} in Scrapy

When I try to scrape the price list, the value I get back is a template variable instead of the actual price. Could you help me with this, please?
This is the source code:
<div id="template-precio" type="text/x-handlebars-template" style="display:none">
{{#if potentialSavedPrice}}
<span class="saving">Ahorra <b>${{potentialSavedPrice}}</b></span>
<span class="list">Precio lista: <i>${{price}}</i></span>
<span class="lable">Precio web:</span>
{{#if eCoupon}}
<span class="price">${{potentialCouponPrice}}</span>
<span class="coupon">Usa el cupón <b>{{eCoupon}}</b></span>
{{else}}
<span class="price">${{potentialDiscountedPrice}}</span>
{{/if}}
{{else}}
<span class="lable">Precio web:</span>
<span class="price">${{price}}</span>
{{/if}}
My xpath:
response.xpath('.//*[@class="list"]/i/text()').extract_first()
Result:
'${{price}}'
This is the code in the chrome inspector:
<div class="details-price">
<div class="">
<span class="saving">Ahorra <b>$369.943,87</b></span>
<span class="list">Precio lista: <i>$3.699.438,68</i></span>
<span class="lable">Precio web:</span>
<span class="price">$3.329.494,81</span>
<span class="coupon">Usa el cupón <b>B2BUSINESS</b></span>
This is the page:
http://www3.lenovo.com/co/es/ofertas-y-cupones
Thank you

Create Xpath Manually

I want to create an XPath that must contain the word "Asia".
//div[@id='destination-loadLevel0']/div/ul/li/div/div/span
*for Asia
//div[@id='destination-loadLevel0']/div/ul/li[2]/div/div/span
*for Europe
//div[@id='destination-loadLevel0']/div/ul/li[3]/div/div/span
*for USA
<div class="col-md-3 level-column column-viewport" id="destination-loadLevel0">
<div class="row">
<ul>
<li class="col-md-3 col-hotspot unselectable-text" data-parent-group-id="" data-groupid="2244604" data-name="Asia;|03|00|00|">
<div class="tile-container">
<div class="image-label" data-toggle="tooltip" data-container="body" data-placement="top" title="" data-original-title="Asia">
<span class="title ellipsis">Asia</span>
<span class="icon icon-angle-right"></span>
</div>
//*[contains(text(),'Asia')] | //@*[contains(.,'Asia')]/parent::*
This returns all elements whose text contains 'Asia' or that have an attribute whose value contains 'Asia'.
Output:
<li class="col-md-3 col-hotspot unselectable-text" data-parent-group-id="" data-groupid="2244604" data-name="Asia;|03|00|00|" />
<div class="image-label" data-toggle="tooltip" data-container="body" data-placement="top" title="" data-original-title="Asia" />
<span class="title ellipsis">Asia</span>
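To see why those three elements match, the same test can be approximated in plain Python. This is a sketch under assumptions, not an XPath evaluation: it uses the standard library's xml.etree.ElementTree (which has no contains() function), so the check is written as an ordinary loop. FRAGMENT is a well-formed, abridged copy of the markup from the question, and mentions is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Abridged, well-formed copy of the markup from the question.
FRAGMENT = """<div id="destination-loadLevel0">
  <div class="row">
    <ul>
      <li data-groupid="2244604" data-name="Asia;|03|00|00|">
        <div class="tile-container">
          <div class="image-label" data-original-title="Asia">
            <span class="title ellipsis">Asia</span>
            <span class="icon icon-angle-right"></span>
          </div>
        </div>
      </li>
    </ul>
  </div>
</div>"""


def mentions(element, word):
    """True if the element's leading text or any attribute value contains word."""
    in_text = word in (element.text or "")
    in_attributes = any(word in value for value in element.attrib.values())
    return in_text or in_attributes


root = ET.fromstring(FRAGMENT)
# root.iter() walks the tree in document order, starting with the root itself.
matching_tags = [el.tag for el in root.iter() if mentions(el, "Asia")]
print(matching_tags)
```

The li and the inner div match through their data-name and data-original-title attributes, and the span matches through its text, which is the same split the union (|) in the XPath expresses.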

xpath specific selection with condition

This might be simple, but I would like to select everything within <div class="rc-box-citations-body"> under the condition that it must belong to <div class="definitionBox" id="meaning-1-1">, thereby uniquely identifying it. How can I do that with XPath? Thanks.
<div class="definitionIndent">
<div class="definitionNumber">1.a</div>
<div class="definitionIndent">
<div class="definitionBox" id="meaning-1-1">
<span class="textmedium">
<span class="stampNoBorder">text</span>
<span class="definition">text</span>
</span>
</div>
<div class="definitionBox">
<div class="rc-box-citations">
<div class="rc-box-citations-top">
<span class="rc-citations-north-west"> </span>
<span class="rc-citations-north-east"> </span>
</div>
<div class="rc-box-citations-body"><span class="citat">text</span> <a class="sourcepop" href="javascript:void(0);"><span class="source">text</span><span class="popup">text</span></a></div>
<div class="rc-box-citations-bot">
<span class="rc-citations-south-west"> </span>
<span class="rc-citations-south-east"> </span>
</div>
</div>
</div>
</div>
</div>
If I modify your XML slightly, and take "under the condition that it must belong to" to mean "is a descendant of", then this XPath works:
//div[@class='definitionBox'][@id='meaning-1-2']//div[@class='rc-box-citations-body']
The modified XML is:
<?xml version="1.0" encoding="utf-16"?>
<div class="definitionIndent">
<div class="definitionNumber">1.a</div>
<div class="definitionIndent">
<div class="definitionBox" id="meaning-1-1">
<span class="textmedium">
<span class="stampNoBorder">text</span>
<span class="definition">text</span>
</span>
</div>
<div class="definitionBox" id="meaning-1-2">
<div class="rc-box-citations">
<div class="rc-box-citations-top">
<span class="rc-citations-north-west"></span>
<span class="rc-citations-north-east"></span>
</div>
<div class="rc-box-citations-body">
<span class="citation">text</span>
<a class="sourcepop" href="javascript:void(0);">
<span class="source">text</span>
<span class="popup">text</span>
</a>
</div>
<div class="rc-box-citations-bot">
<span class="rc-citations-south-west"></span>
<span class="rc-citations-south-east"></span>
</div>
</div>
</div>
</div>
</div>
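The expression can also be checked from Python. The sketch below assumes the standard library's xml.etree.ElementTree, whose limited XPath subset covers chained attribute predicates and the // descendant step used here; the path only needs a leading dot because ElementTree evaluates it relative to an element. DOCUMENT is an abridged version of the modified XML above.

```python
import xml.etree.ElementTree as ET

# Abridged version of the modified XML (second definitionBox has an id).
DOCUMENT = """<div class="definitionIndent">
  <div class="definitionNumber">1.a</div>
  <div class="definitionIndent">
    <div class="definitionBox" id="meaning-1-1">
      <span class="textmedium">
        <span class="definition">text</span>
      </span>
    </div>
    <div class="definitionBox" id="meaning-1-2">
      <div class="rc-box-citations">
        <div class="rc-box-citations-body">
          <span class="citation">text</span>
        </div>
      </div>
    </div>
  </div>
</div>"""

root = ET.fromstring(DOCUMENT)

# Same predicates as the XPath in the answer, made relative with a leading dot.
path = (".//div[@class='definitionBox'][@id='meaning-1-2']"
        "//div[@class='rc-box-citations-body']")
bodies = root.findall(path)

# With id='meaning-1-1' nothing matches, because that box has no citations body.
other = root.findall(
    ".//div[@class='definitionBox'][@id='meaning-1-1']"
    "//div[@class='rc-box-citations-body']"
)

print(len(bodies), len(other))
for body in bodies:
    print(body.find("span").attrib["class"])
```

The second query illustrates why the XML had to be modified: the descendant step only finds the citations body inside the box that actually contains it.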
The tool I used to test the XPath is XPathVisualizer.
