I need to write a dynamic XPath for the scenario below

HTML Code:
<font size="6.2em;" color="red"> $0.00</font>
I need to print the price each time I add more items.
The XPath I've tried is:
//font[starts-with(normalize-space()='$')]
but it does not locate the price element.
The URL is: http://demo.guru99.com/payment-gateway/process_purchasetoy.php
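One likely cause is that starts-with() takes two arguments, the string and the prefix, so the predicate would normally be written starts-with(normalize-space(), '$') rather than with an = inside the call. The sketch below is not from the original thread. It uses Python's standard-library ElementTree, which does not implement XPath functions, so the same prefix test is applied in plain Python. The wrapping <div> and the second <font> element are hypothetical; only the first <font> comes from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: only the first <font> is taken from the question.
html = '<div><font size="6.2em;" color="red"> $0.00</font><font>Total</font></div>'
root = ET.fromstring(html)


def price_texts(node):
    """Return the trimmed text of every <font> whose text starts with '$'."""
    return [
        font.text.strip()
        for font in node.iter("font")
        if (font.text or "").strip().startswith("$")
    ]


prices = price_texts(root)
print(prices)
```

With an engine that supports XPath 1.0 functions, the equivalent expression would be //font[starts-with(normalize-space(), '$')]; whether it is specific enough for the real page has not been checked here.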

Related

XPath one of multiple elements of an attribute

In this HTML, using Scrapy, I can access the full info-car attribute with the XPath './/@info-car':
<div class="car car-root"
info-car='{brand":"BMW","Price":"&#30000"name":"X5","color":null,"}'>
</div>
What is the XPath to pick only the name from info-car?

You can obtain the name by combining XPath with a regex. See the sample code below:
response.xpath(".//@info-car").re_first(r'"name":"(.*)",')
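As a rough illustration of the regex step on its own, the sketch below applies a similar pattern to the attribute string copied from the question, using only Python's re module rather than Scrapy. The character class [^"]* is a slightly stricter variant chosen here as an assumption, not the answer's pattern; it stops at the first closing quote instead of relying on a trailing comma.

```python
import re

# Attribute value copied verbatim from the question (it is not valid JSON).
info_car = '{brand":"BMW","Price":"&#30000"name":"X5","color":null,"}'

# Capture everything between the quotes that follow "name":
match = re.search(r'"name":"([^"]*)"', info_car)
name = match.group(1) if match else None
print(name)
```

If the attribute held well-formed JSON, parsing it with json.loads and reading the "name" key would likely be more robust than a regex.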

How to scrape data using Nokogiri from elements having two 'data-' attributes

I want to scrape data using Nokogiri from some HTML:
<td data-bar="hoge" data-date="2000-01-01" class="modals"></td>
<td data-bar="fuga" data-date="2000-01-02" class="modals"></td>
I wrote:
element = page.css("td[data-bar='hoge'][data-date='2000-01-01']")
but element.length returns 0.
How do I distinguish elements having two data- attributes?

Try using XPath selectors instead. This worked for me:
element = page.xpath "//td[@data-bar='hoge'][@data-date='2000-01-01']"
In this example, the // portion will match any td element (with those attributes) in the document, which may not be desirable. In that case, you would need to write a more explicit XPath to the node.
Here's the documentation for XPath: https://www.w3.org/TR/xpath/
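The same chained-predicate idea can be tried outside Nokogiri. The sketch below is an illustration in Python's standard-library ElementTree, not the answer's Ruby code; the surrounding <table> and <tr> are hypothetical wrappers added so the fragment parses as one document.

```python
import xml.etree.ElementTree as ET

# The two <td> elements come from the question; <table>/<tr> are assumed wrappers.
html = """<table><tr>
<td data-bar="hoge" data-date="2000-01-01" class="modals"></td>
<td data-bar="fuga" data-date="2000-01-02" class="modals"></td>
</tr></table>"""
table = ET.fromstring(html)

# Each [...] predicate narrows the match, so both attributes must be present.
matches = table.findall(".//td[@data-bar='hoge'][@data-date='2000-01-01']")
print(len(matches), matches[0].get("data-date"))
```

Starting the path with .// keeps the search relative to the element it is called on, which is one way to avoid matching every td in the document.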

Xpath - how to extract data using class target

Code snippet:
<td class="right odds down"><a class=" betslip" target="unibet" onmouseout="delayHideTip()" onmouseover="page.hist(this,'P-0.00-0-0','24vekxv464x0x4g25d',5,event,1,1)" href="/bookmaker/unibet/betslip//event/1002752206/coupon/single,2133228960,p,[0]">1.70</a></td>
I'm trying to extract data from a page where the class target is "unibet".
What would be the correct formatting for this query?
I've tried:
//*[classtarget="unibet"]//td/a/@class

Well, target is an attribute of the <a> element, not a class. The XPath that finds the <td> element and then returns its child <a> element whose target attribute equals "unibet" is:
//td/a[@target='unibet']
If you want to return the class attribute of the <a> element instead, simply add a trailing /@class to the above XPath:
//td/a[@target='unibet']/@class
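To see both results side by side, here is a minimal sketch using Python's standard-library ElementTree. It is an illustration only: the markup is a shortened, hypothetical version of the snippet in the question (event handlers dropped, href truncated), and because ElementTree cannot end a path with an attribute step, the class is read with get() instead.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the question's markup; <table>/<tr> are assumed wrappers.
html = (
    '<table><tr><td class="right odds down">'
    '<a class=" betslip" target="unibet" href="/bookmaker/unibet/betslip">1.70</a>'
    "</td></tr></table>"
)
table = ET.fromstring(html)

# Select <a> children of <td> whose target attribute equals "unibet".
links = table.findall(".//td/a[@target='unibet']")
odds = links[0].text
css_class = links[0].get("class")  # plays the role of the trailing /@class step
print(odds, repr(css_class))
```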

Xpath get text of nested item not working but css does

I'm making a crawler with Scrapy and wondering why my xpath doesn't work when my CSS selector does? I want to get the number of commits from this html:
<li class="commits">
<a data-pjax="" href="/samthomson/flot/commits/master">
<span class="octicon octicon-history"></span>
<span class="num text-emphasized">
521
</span>
commits
</a>
</li>
XPath:
response.xpath('//li[@class="commits"]//a//span[@class="text-emphasized"]//text()').extract()
CSS:
response.css('li.commits a span.text-emphasized').css('::text').extract()
CSS returns the number (unescaped), but XPath returns nothing. Am I using the // for nested elements correctly?

Your predicate compares against the entire class attribute, and the span's class is "num text-emphasized", so @class="text-emphasized" matches nothing. Use the contains function to check only that text-emphasized is present:
response.xpath('//li[@class="commits"]//a//span[contains(@class, "text-emphasized")]//text()').extract()[0].strip()
Otherwise, include num as well:
response.xpath('//li[@class="commits"]//a//span[@class="num text-emphasized"]//text()').extract()[0].strip()
Here, extract() converts the matches to strings, [0] takes the first one, and strip() removes the surrounding whitespace, leaving just the number.
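The difference between an exact attribute comparison and a class-token check can be reproduced without Scrapy. The sketch below is an illustration in Python's standard-library ElementTree, which has no contains() function, so the token check is written in Python as an assumed stand-in for it.

```python
import xml.etree.ElementTree as ET

# Markup from the question, with the closing </li> completed.
html = """<li class="commits">
<a data-pjax="" href="/samthomson/flot/commits/master">
<span class="octicon octicon-history"></span>
<span class="num text-emphasized">
521
</span>
commits
</a>
</li>"""
li = ET.fromstring(html)

# Exact comparison against one class name: the attribute is "num text-emphasized".
exact_partial = li.findall(".//span[@class='text-emphasized']")
# Exact comparison against the full attribute value.
exact_full = li.findall(".//span[@class='num text-emphasized']")
# Token check, standing in for contains(@class, "text-emphasized").
by_token = [s for s in li.iter("span") if "text-emphasized" in s.get("class", "").split()]

commits = by_token[0].text.strip()
print(len(exact_partial), len(exact_full), commits)
```

Splitting the attribute on whitespace is a little stricter than contains(), which would also match a class such as text-emphasized-extra.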

Parsing inner tags using Nokogiri

I'm stuck not being able to parse irregularly embedded html tags. Is there a way to remove all html tags from a node and retain all text?
I'm using the code:
rows = doc.search('//table[@id="table_1"]/tbody/tr')
details = rows.collect do |row|
  detail = {}
  [
    [:word, 'td[1]/text()'],
    [:meaning, 'td[6]/font'],
  ].collect do |name, xpath|
    detail[name] = row.at_xpath(xpath).to_s.strip
  end
  detail
end
Using Xpath:
[:meaning, 'td[6]/font']
generates
:meaning: ! '<font size="3">asking for information specifying <font
color="#CC0000" size="3">what is your name?</font> /what/ as in, <font color="#CC0000" size="3">I'm not sure what you mean</font>
/what/ as in <a style="text-decoration: none;" href="http://somesecretlink.com">what</a></font>
On the other hand, using Xpath:
'td/font/text()'
generates
:meaning: asking for information specifying
thus ignoring all children of the node. What I want to achieve is this
:meaning: asking for information specifying what is your name? /what/ as in, I'm not sure what you mean /what/ as in what? I can't hear you

This depends on what you need to extract. If you want all text in font elements, you can do it with the following xpath:
'td/font//text()'
It extracts all text nodes in font tags. If you want all text nodes in the cell, then:
'td//text()'
You can also call the text method on a Nokogiri node:
row.at_xpath(xpath).text
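The contrast between a node's own text and the text of all its descendants can be sketched as follows. This is an illustration in Python's standard-library ElementTree rather than Nokogiri: itertext() plays the role of //text(), and the markup is a shortened, hypothetical version of the cell in the question.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the question's cell: a <font> with nested <font> children.
html = """<td><font size="3">asking for information specifying <font
color="#CC0000" size="3">what is your name?</font> /what/ as in, <font
color="#CC0000" size="3">I'm not sure what you mean</font></font></td>"""
cell = ET.fromstring(html)
font = cell.find("font")

# Only the text before the first child element (like 'td/font/text()' picking one node).
direct = (font.text or "").strip()

# All descendant text nodes in document order (like 'td/font//text()'),
# with runs of whitespace collapsed to single spaces.
full = " ".join("".join(font.itertext()).split())
print(direct)
print(full)
```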

I added an answer for this same sort of question the other day. It's a very easy process.
Take a look at: Convert HTML to plain text and maintain structure/formatting, with ruby
