Web Scraping - XPath issue

I need to extract the text 120 from this HTML code:
<section class="details">
<h2>Détails du bien</h2>
<table>
....
<tr>
<td>Surface habitable (m²)</td>
<td class="right" title="120">120 </td>
</tr>
...
</table>
</section>
I used this xpath, but it returns an empty list:
//td[contains(text(),"Surface")]/td[@class="right"]/text()
What am I doing wrong?

Try using an XPath axis. The second td is a sibling of the first one, not its child, so the /td step matches nothing:
//td[contains(text(),"Surface")]/following-sibling::td[@class="right"]/text()
This should solve your problem.
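The sibling relationship can also be checked without a full XPath engine. Below is a minimal sketch in Python using only the standard library's xml.etree.ElementTree, which implements a limited XPath subset without the following-sibling axis, so the sibling step is done in plain Python. The function name surface_value and the trimmed sample markup are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the markup from the question (elided rows left out).
HTML = """<section class="details">
<h2>Détails du bien</h2>
<table>
<tr>
<td>Surface habitable (m²)</td>
<td class="right" title="120">120 </td>
</tr>
</table>
</section>"""


def surface_value(markup):
    """Return the text of the td.right that follows the 'Surface' cell."""
    root = ET.fromstring(markup)
    for row in root.iter("tr"):
        cells = row.findall("td")  # direct td children, in document order
        for i, cell in enumerate(cells):
            if "Surface" in (cell.text or ""):
                # Later cells in the same row are the "following siblings".
                for sibling in cells[i + 1:]:
                    if sibling.get("class") == "right":
                        return (sibling.text or "").strip()
    return None


print(surface_value(HTML))
```

With a library that implements full XPath 1.0, the expression from the answer can be passed to the engine directly instead.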

Related

XPath: find text according to the last word in the string

I need to find the whole text according to the last word in the string. I have something like this:
<table>
<tr>
<td style='white-space:nowrap;'>
<a href=''>test</a>
</td>
<td>any text</td>
<td>text text texttofind</td>
<td>Not Available</td>
<td class='aui-lozenge aui-lozenge-default'>text</td>
</tr>
<tr>
<td style='white-space:nowrap;'>
<a href=''>test</a>
</td>
<td>any text</td>
<td>text text texttofind2</td>
<td>Not Available</td>
<td class='aui-lozenge aui-lozenge-default'>text</td>
</tr>
<tr>
<td style='white-space:nowrap;'>
<a href=''>test</a>
</td>
<td>any text</td>
<td>text text texttofind3</td>
<td>Not Available</td>
<td class='aui-lozenge aui-lozenge-default'>text</td>
</tr>
</table>
I need to find the whole text value whose last word is texttofind:
<td>text text texttofind</td>
I can't use contains, because it would match multiple values. I need something like ends-with, but I am using XPath 1.0.
I tried something like this, but it is not working and I am not sure what is wrong:
//tr[substring(., string-length(@td)
- string-length('texttofind') + 1) = 'texttofind']
Or would it be better to use matches?
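You're almost there. Apply the test to each td rather than to the whole tr, and measure the length of that cell's own text with string-length(.). Try changing your XPath expression to
//tr//td[substring(., string-length(.)
- string-length('texttofind') + 1) = 'texttofind']
and see if it works.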
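To see why the substring arithmetic behaves like ends-with, here is a small sketch in Python that mimics the two-argument form of XPath 1.0 substring() for integer positions. The helper names xpath_substring and ends_with are hypothetical, and the cell texts are copied from the table in the question.

```python
def xpath_substring(text, start):
    """Mimic two-argument XPath 1.0 substring() for integer positions.

    XPath positions are 1-based; a start below 1 yields the whole string.
    """
    begin = max(start - 1, 0)
    return text[begin:]


def ends_with(text, suffix):
    """Same arithmetic as the XPath in the answer above."""
    start = len(text) - len(suffix) + 1
    return xpath_substring(text, start) == suffix


cells = [
    "any text",
    "text text texttofind",
    "text text texttofind2",
    "Not Available",
]
matches = [c for c in cells if ends_with(c, "texttofind")]
print(matches)
```

Only the cell whose text ends exactly in texttofind survives the filter; texttofind2 is rejected because its last ten characters differ.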

How can I load this "label" using importxml & xPath

I've been trying to get this to load into google spreadsheet, but no success so far:
http://www.aastocks.com/tc/stocks/analysis/company-fundamental/basic-information?symbol=00027
I have the formula as follows:
=importxml ("http://www.aastocks.com/tc/stocks/analysis/company-fundamental/basic-information?symbol=" & To_Text(A7),"//td[@id='sb2-last']/label[@id='SQ_Last']/following-sibling::label/text()")
Result:
The content imported is empty.
The html part:
<table>
<tbody>
<tr>
<td>現價<label id="SQ_Currency">(港元)</label></td>
<td id="sb2-last">
<label id="SQ_Last" class="cls">**57.500**</label>
</td>
</tr>
</tbody>
</table>
Try:
=INDEX(IMPORTXML(
"http://www.aastocks.com/tc/stocks/analysis/company-fundamental/basic-information?symbol="&
TO_TEXT(A7), "//td[@class='mcFont cls']"), 26)

Nokogiri and tables

I am parsing a web page with a standard structure as follows:
<html>
<body>
<table>
<tbody>
<tr class="active">
<td>name1</td>
<td>name2</td>
<td>name3</td>
</tr>
</tbody>
</table>
</body>
</html>
For the life of me, I can't access the 'tbody' or 'tr' elements.
require 'nokogiri'
require 'open-uri'
response = URI.open('http://my_url')
node = Nokogiri::HTML(response).css('table')
puts node
Returns
#<Nokogiri::XML::Element:0x8294c08c name="table" attributes=[#<Nokogiri::XML::Attr:0x8294c014 name="id" value="beta-users">] children=[#<Nokogiri::XML::Text:0x82953bc0 "\n">]>
I have tried various tricks but can't seem to dig deeper down to a lower-level child than 'table'.
At best, I can get to the lowest-level Text object by using
node.children
but
node.children.text
returns "\n".
Despite searching for some hours, I am none the wiser about how to sort it out. Any thoughts?
There is an unclosed class attribute value in your sample. It should be:
<html>
<body>
<table>
<tbody>
<tr class="active">
<td>name1</td>
<td>name2</td>
<td>name3</td>
</tr>
</tbody>
</table>
</body>
</html>
After correcting this, you can select the cells directly and print their text:
node = Nokogiri::HTML(response).css('table tbody tr td')
node.each { |child| puts child.text }
name1
name2
name3
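For comparison, the same traversal can be sketched in Python with only the standard library. This is not Nokogiri: xml.etree.ElementTree requires well-formed markup, whereas Nokogiri's HTML parser is lenient. The function name cell_texts is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# The corrected sample page from the answer above.
PAGE = """<html>
<body>
<table>
<tbody>
<tr class="active">
<td>name1</td>
<td>name2</td>
<td>name3</td>
</tr>
</tbody>
</table>
</body>
</html>"""


def cell_texts(markup):
    """Return the text of every td reached via table > tbody > tr."""
    root = ET.fromstring(markup)
    return [td.text for td in root.findall(".//table/tbody/tr/td")]


for text in cell_texts(PAGE):
    print(text)
```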

selenium webdriver xpath generation

I want to create an XPath for clicking on "Run" (4th column) based on the first column value (xyz). The XPath below doesn't work. Can you suggest a better way of writing it?
//table/tbody/tr/td[text()='xyz fix']/parent::tr/td[4]
<div id="main">
<table class="FixedLayout" width="1000px">
<tbody>
<tr></tr>
<tr>
<td class="RowHeight">
xyz
</td>
<td>xyz fix</td>
<td>1125</td>
<td>
Run
</td>
</tr>
<tr>
<td class="RowHeight">
abc
</td>
<td>abc fix</td>
<td>1125</td>
<td>
Run
</td>
</tr>
</tbody>
</table>
</div>
I don't see why yours didn't work. Please clarify what "doesn't work" means: NoSuchElementException? ElementNotVisibleException? A wrong XPath? The link not being clicked?
Meanwhile, try the following XPaths (the issue could be in your Selenium code rather than in the XPath).
Here I assume you want the <a> link rather than the <td>, because you mentioned you want to click it.
Use an XPath predicate:
//*[@id='main']//table/tbody/tr[td[text()='xyz']]/td[4]/a
Use an XPath predicate with an attribute selector to avoid using an index:
//*[@id='main']//table/tbody/tr[td[text()='xyz']]//a[contains(@href, 'Instance/Create')]
Use .. to get the parent:
//*[@id='main']//table/tbody/tr/td[text()='xyz']/../td[4]/a
If the cell text is surrounded by whitespace, as it appears to be in your markup, replace text()='xyz' with normalize-space(.)='xyz'.
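The row lookup itself (find the row by its first column, then take the fourth cell) can be sketched in Python with the standard library only. This is an illustration of the selection logic, not Selenium code; the function name cell_for_row is hypothetical, and it returns the td rather than an <a>, because the markup shown in the question contains no link.

```python
import xml.etree.ElementTree as ET

# Markup copied from the question.
TABLE = """<div id="main">
<table class="FixedLayout" width="1000px">
<tbody>
<tr></tr>
<tr>
<td class="RowHeight">
xyz
</td>
<td>xyz fix</td>
<td>1125</td>
<td>
Run
</td>
</tr>
<tr>
<td class="RowHeight">
abc
</td>
<td>abc fix</td>
<td>1125</td>
<td>
Run
</td>
</tr>
</tbody>
</table>
</div>"""


def cell_for_row(markup, first_col, col_index):
    """Return the col_index-th td (1-based, as in XPath) of the row whose
    first cell, after trimming whitespace, equals first_col."""
    root = ET.fromstring(markup)
    for row in root.iter("tr"):
        cells = row.findall("td")
        if len(cells) >= col_index and (cells[0].text or "").strip() == first_col:
            return cells[col_index - 1]
    return None


cell = cell_for_row(TABLE, "xyz", 4)
print(cell.text.strip())
```

Trimming the first cell before comparing plays the role that normalize-space() plays in XPath: the cell text in this markup is wrapped in newlines, so an exact comparison against 'xyz' would fail.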

knockout 2.0 doesn't work in IE8

This view works well in IE9 and Chrome, but not in IE8.
When the page is rendered, this is what it looks like:
My HTML (MVC3 View) is as shown below.
<div id="machinedisplay" data-bind="with: selectedMachine" >
<h2><span data-bind="text: MachineDesciption" /></h2>
<!-- ko with: my.vm.machineData -->
<table>
<thead><tr>
<th>Point Name</th><th>Description</th><th>Points Data</th>
</tr></thead>
<tbody data-bind="foreach: Points">
<tr>
<td data-bind="text: PointName()"></td>
<td data-bind="text: PointDesciption()"></td>
<td>
<table style="width:100%;">
<thead><tr>
<th>Name</th><th>Description</th><th>Value</th><th></th>
</tr></thead>
<tbody data-bind="foreach: Params">
<tr>
<td data-bind="text: ParameterName"></td>
<td data-bind="text: ParameterDescription"></td>
<td data-bind="text: StringValue"></td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<!-- /ko -->
</div>
Any ideas for an IE8 workaround?
EDIT:
To illustrate this problem on a simpler model, check out this fiddle http://jsfiddle.net/ericpanorel/nzKvb/
I figured that I am running into problems because I am using the "with" or "if" bindings. I read somewhere that this causes problems with IE8.
I used IE9, and if you use your developer tools to switch from IE9 to IE8, this Fiddle doesn't work properly anymore. This fiddle is actually derived from one of knockout's samples (http://knockoutjs.com/examples/gridEditor.html)
EDIT:
I updated the fiddle... http://jsfiddle.net/nzKvb/20/
It has something to do with the shorthand (self-closing) form of tags inside nested containerless bindings:
<!-- ko if: Allowed-->
<h2>
<span data-bind="text: Dummy"/> <===== This will bomb in IE8
</h2>
The jsFiddle had an extra comma at the end of the array, which IE8 was treating as a null object:
var viewModel = new GiftModel([
{ name: "Tall Hat", price: "39.95"},
{ name: "Long Cloak", price: "120.00"},
{ name: "HK 416", price: "2420.00"}, <-- HERE !!!
]);
ko.applyBindings(viewModel);
The fiddle works fine without the comma:
http://jsfiddle.net/XPMUA/
Not sure if this solves your underlying problem but at least the fiddle is working now :-)
The problem is here:
<!-- ko if: Allowed-->
Older IE versions can be picky about using JavaScript reserved words as property names, so you should quote the binding name and write 'if'.
The same problem is described at another link.
