xpath selecting text from link in <td> & text from <td> - ruby

I have the following code which works very well:
rows = diary_HTML.xpath('//*[@id="main"]/div[2]/table/tbody/tr')
food_diary = rows.collect do |row|
detail = {}
[
["Food", 'td[1]/text()'],
["Calories", 'td[2]/text()'],
["Carbs", 'td[3]/text()'],
["Fat", 'td[4]/text()'],
["Protein", 'td[5]/text()'],
["Cholest", 'td[6]/text()'],
].each do |name, xpath|
detail[name] = row.at_xpath(xpath).to_s.strip
end
detail
end
However the "Food" td does not only include text, but also a link from which I want to get the text.
I know I can use 'td[1]/a/text()' to get the link text, but how do I get both?
'td[1]/a/text()' or 'td[1]/text()'
EDITED - Added Snippet.
I am trying to include the <tr class="meal_header">
<td class="first alt">Breakfast</td> on the first row, all lines with other regular tds on other rows whilst excluding td1 on the bottom row.
<tr class="meal_header">
<td class="first alt">Breakfast</td>
<td class="alt">Calories</td>
<td class="alt">Carbs</td>
<td class="alt">Fat</td>
<td class="alt">Protein</td>
<td class="alt">Sodium</td>
<td class="alt">Sugar</td>
</tr>
<tr>
<td class="first alt">
<a onclick="showEditFood(3992385560);" href="#">Hovis (Uk - White Bread (40g) Toasted With Flora Light Marg, 2 slice</a> </td>
<td>262</td>
<td>36</td>
<td>9</td>
<td>7</td>
<td>0</td>
<td>3</td>
</tr>
<tr class="bottom">
<td class="first alt" style="z-index: 10">
Add Food
<div class="quick_tools">
Quick Tools
<div id="quick_tools_0" class="quick_tools_options hidden">
<ul>
<li><a onclick="showLightbox(200, 250, '/food/quick_add?meal=0&date=2013-04-15'); return false;">Quick add calories</a></li>
<li>Remember meal</li>
<li>Copy yesterday</li>
<li>Copy from date</li>
<li>Copy to date</li>
</ul>
</div>
<div id="recent_meals_0" class="recent_meal_options hidden">
<ul id="recent_meal_options_0">
<li class="header">Copy from which date?</li>
<li>Sunday, April 14</li>
<li>Saturday, April 13</li>
</ul>
</div>
</div>
</td>
<td>285</td>
<td>39</td>
<td>9</td>
<td>10</td>
<td>0</td>
<td>3</td>
<td></td>

The short answer is: use Nokogiri::XML::Element#text, it will give the text of the element plus subelements (your a for example).
You can also clean that code up quite a bit:
keys = ["Food", "Calories", "Carbs", "Fat", "Protein", "Cholest"]
food_diary = rows.collect do |row|
Hash[keys.zip row.search('td').map(&:text)]
end
And as a final tip, avoid using xpath with html, css is so much nicer.
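To see what the zip-based cleanup produces, here is a small sketch that needs no HTML at all. The cells array is an assumption: it stands in for what row.search('td').map(&:text) might return for the sample food row, with whitespace already stripped.

```ruby
# Stand-in for row.search('td').map(&:text) on the sample food row.
cells = [
  'Hovis (Uk - White Bread (40g) Toasted With Flora Light Marg, 2 slice',
  '262', '36', '9', '7', '0', '3'
]
keys = %w[Food Calories Carbs Fat Protein Cholest]

# zip pairs each key with the cell at the same position. The row has seven
# cells but only six keys, so the last cell is dropped.
detail = keys.zip(cells).to_h

puts detail['Food']
puts detail['Calories']
```

Note that zip follows the length of keys, so a row with fewer cells than keys would get nil for the missing values rather than raise.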

I think you can achieve this by altering the logic to look at element content when you don't have an explicit text() extraction in the xpath
rows = diary_HTML.xpath('//*[@id="main"]/div[2]/table/tbody/tr')
food_diary = rows.collect do |row|
detail = {}
[
["Food", 'td[1]'],
["Calories", 'td[2]/text()'],
["Carbs", 'td[3]/text()'],
["Fat", 'td[4]/text()'],
["Protein", 'td[5]/text()'],
["Cholest", 'td[6]/text()'],
].each do |name, xpath|
if xpath.include?('/text()')
detail[name] = row.at_xpath(xpath).to_s.strip
else
detail[name] = row.at_xpath(xpath).content.strip
end
end
detail
end
You could also add e.g. a symbol to each array entry describing how the data should be extracted, and use a case block to handle each item accordingly once the xpath has been evaluated.
Note you could also do what you want by walking the node structure returned by xpath recursively, but that seems like overkill if you just want to ignore markup, links etc.

Related

Locating table cell using the header cell text

I have table kind of appearance as shown below but it's not a single table. Header is in one table and rows are in another table.
The header has Primary, Language and the add button which is one table and rest of the two rows are in another table. Now I have to identify the cell using the header text. For an example, If I give 1 Language then it has to locate the first row second cell in which Arabic is chosen. Likewise, If I give 2 Primary it has locate the second row first column.
The HTML code is shown in the pic below. If it's possible to solve this problem, then I will give the actual code.
<div id="d78Pt30" style="width: 35em;" class="gridMaxHeight z-grid">
<div id="d78Pt30-head" class="z-grid-header" style="">
<table id="d78Pt30-headtbl" style="table-layout: fixed;" width="100%">
<colgroup id="d78Px30-hdfaker">
<col id="d78Py30-hdfaker" style="width: 61px;">
<col id="d78Pz30-hdfaker" style="">
<col id="d78P_40-hdfaker" style="width: 50px;">
<col id="d78Px30-hdfaker-bar" style="width: 0px">
</colgroup>
<tbody id="d78Pt30-headrows">
<tr id="d78Px30" class="z-columns" style="text-align: left;">
<th id="d78Py30" class="z-column">
<div id="d78Py30-cave" class="z-column-content">
<div class="z-column-sorticon"><i id="d78Py30-sort-icon"></i></div>
Primary
</div>
</th>
<th id="d78Pz30" class="z-column">
<div id="d78Pz30-cave" class="z-column-content">
<div class="z-column-sorticon"><i id="d78Pz30-sort-icon"></i></div>
Language
</div>
</th>
<th id="d78P_40" class="z-column">
<div id="d78P_40-cave" class="z-column-content">
<div class="z-column-sorticon"><i id="d78P_40-sort-icon"></i></div>
<a id="d78P040" class="z-a" href="javascript:;"><img src="assets/images/add.png"
align="absmiddle"></a></div>
</th>
<th id="d78Px30-bar" class="z-columns-bar"></th>
</tr>
</tbody>
</table>
</div>
<div class="z-grid-header-border"></div>
<div id="d78Pt30-body" class="z-grid-body" style="overflow: auto;">
<table id="d78Pt30-cave" style="table-layout: fixed;" width="100%">
<colgroup id="d78Px30-bdfaker">
<col id="d78Py30-bdfaker" style="width: 61px;">
<col id="d78Pz30-bdfaker" style="">
<col id="d78P_40-bdfaker" style="width: 50px;">
</colgroup>
<tbody id="d78Pi50" class="z-rows">
<tr id="d78P260" class="gridMaxHeight z-row">
<td id="d78P360-chdextr" class="z-row-inner">
<div id="d78P360-cell" class="z-row-content"><span id="d78P360"
class="z-radio z-radio-default"><input
type="radio" id="d78P360-real" name="d78P360" checked="checked"><label for="d78P360-real"
id="d78P360-cnt"
class="z-radio-content"></label></span>
</div>
</td>
<td id="d78P460-chdextr" class="z-row-inner">
<div id="d78P460-cell" class="z-row-content"><span id="d78P460" class="z-combobox"
style="width: 225px;"><input id="d78P460-real"
class="z-combobox-input"
autocomplete="off"
value="" type="text"
size="20"
style="width: 196px;"><a
id="d78P460-btn" class="z-combobox-button"><i id="d78P460-icon"
class="z-combobox-icon z-icon-caret-down"></i></a><div
id="d78P460-pp" style="display: none;"></div></span></div>
</td>
<td id="d78Py60-chdextr" class="z-row-inner">
<div id="d78Py60-cell" class="z-row-content">
<div id="d78Py60" class="z-hlayout">
<div id="d78Pz60-chdex" class="z-hlayout-inner" style=""><a id="d78Pz60" class="z-a"
href="javascript:;"><img
src="assets/images/delete.png" align="absmiddle"></a></div>
</div>
</div>
</td>
</tr>
</tbody>
<tbody class="z-grid-emptybody">
<tr>
<td id="d78Pt30-empty" style="display: none;" colspan="3">
<div id="d78Pt30-empty-content" class="z-grid-emptybody-content">No data available</div>
</td>
</tr>
</tbody>
</table>
</div>
</div>
As you can see in the HTML, there are two tables and first one is having the header and second one is having the rest of the rows.
Since the full markup of the table was not provided, I've used the HTML at the end of the answer to illustrate the possible solutions.
Solution 1 - Hardcode the column index
If the table columns are static, the easiest solution is to hardcode an index lookup in your code. This solution also makes it easier to deal with header cells that do not have text - eg the add icon.
For example, you know that "Language" header is always column index 1 and "Primary" is always column index 0. Therefore, you know that if you want the "Language" of a data row, it will be the 2nd cell in the row.
def cell_by_header_text(browser, data_row_index, header_text)
columns = ['Primary', 'Language', 'Add'] # must match the order on the page
column_index = columns.index(header_text)
data_table = browser.div(class: 'z-grid-body').table
data_table[data_row_index - 1][column_index] # returns Watir::Cell
end
p cell_by_header_text(browser, 1, 'Language').html
#=> "<td><select><option selected=\"selected\">Arabic</option><option>Bengali</option></select></td>"
p cell_by_header_text(browser, 2, 'Primary').html
#=> "<td><input type=\"radio\" checked=\"\"></td>"
Solution 2 - Dynamic lookup of column index
If the table columns are dynamic or you want a more general solution, you can lookup the column index from the header table.
def cell_by_header_text(browser, data_row_index, header_text)
header_table = browser.div(class: 'z-grid-header').table
column_index = header_table.tds.find_index { |td| td.text == header_text }
data_table = browser.div(class: 'z-grid-body').table
data_table[data_row_index - 1][column_index] # returns Watir::Cell
end
Solution 3 - Domain-specific collection
If you want to improve readability and have more flexibility, you could take it a step further and create a domain-specific collection for the grid:
class LanguageRowCollection
include Enumerable
def initialize(browser)
@browser = browser
end
def each
data_rows.map { |data| yield LanguageRow.new(header_row, data) }
end
def [](value)
to_a[value]
end
private
def header_row
@browser.div(class: 'z-grid-header').table.tr
end
def data_rows
@browser.div(class: 'z-grid-body').table.trs
end
end
class LanguageRow
def initialize(header_row, tr)
@header_row = header_row
@tr = tr
end
def primary_cell
@tr.tds[@header_row.tds.map(&:text).index('Primary')]
end
def primary?
primary_cell.radio.selected?
end
def set_primary(value)
primary_cell.radio.set(value)
end
def language_cell
@tr.tds[@header_row.tds.map(&:text).index('Language')]
end
def language
language_cell.select.text
end
def set_language(value)
language_cell.select.set(value)
end
def remove_cell
# Locating the 3rd column by its image since it doesn't have text
@tr.tds[@header_row.tds.find_index { |td| td.image(class: 'add').exists? }]
end
def remove
remove_cell.link.click
end
end
def languages(browser)
grid = browser.div(class: 'z-grid')
LanguageRowCollection.new(grid)
end
You get a more readable way to get/set values:
# Get/set the language of the first row (note the 0-based index)
languages(browser)[0].language
#=> "Arabic"
languages(browser)[0].set_language('Bengali')
You also get the flexibility of locating rows based on their values:
# Get the primary language
languages(browser).find(&:primary?).language
#=> "Bengali"
# Remove the Arabic row
languages(browser).find { |l| l.language == 'Arabic' }.remove
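The reason find, to_a and friends work on the collection is the include Enumerable plus each pairing, and that part can be tried without a browser. In this sketch FakeRow and RowCollection are stand-ins invented for illustration; the real LanguageRow wraps Watir elements instead of plain values.

```ruby
# Stand-in for LanguageRow: plain values instead of Watir elements.
FakeRow = Struct.new(:language, :primary) do
  def primary?
    primary
  end
end

class RowCollection
  include Enumerable

  def initialize(rows)
    @rows = rows
  end

  # Enumerable only needs each; find, map, select, to_a etc. come for free.
  def each(&block)
    return enum_for(:each) unless block

    @rows.each(&block)
    self
  end

  def [](index)
    to_a[index]
  end
end

rows = RowCollection.new([FakeRow.new('Arabic', false), FakeRow.new('Bengali', true)])

first_language   = rows[0].language
primary_language = rows.find(&:primary?).language

puts first_language
puts primary_language
```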
HTML Example
The following HTML was used for the above examples.
<html>
<body>
<div id="d78Pt30" class="gridMaxHeight z-grid">
<div class="z-grid-header">
<table>
<tr>
<td>Primary</td>
<td>Language</td>
<td>Add</td>
</tr>
</table>
</div>
<div class="z-grid-header-border"></div>
<div class="z-grid-body">
<table>
<tr>
<td><input type="radio"></td>
<td><select><option selected="selected">Arabic</option><option>Bengali</option></select></td>
<td>Minus</td>
</tr>
<tr>
<td><input type="radio" checked></td>
<td><select><option>Arabic</option><option selected="selected">Bengali</option></select></td>
<td>Minus</td>
</tr>
</table>
</div>
</div>
</body>
</html>

Using Ruby / Nokogiri to parse randomized class names

I've been doing calculations by hand when it comes to the remaining percentage of the US Presidential election votes in various states. With so many updates and states – this is getting tiring. So why not automate the process?
Here's what I'm looking at:
The problem is that the class names have been randomized. For example, here's the one I'm interested in:
<td class="jsx-3768461732 votes votes-row">2,450,186</td>
Playing around in irb, I tried to use a wildcard on "votes votes-row", since this only appears when I need it in the doc:
require 'nokogiri'
require 'open-uri'
doc = Nokogiri::HTML(open("https://www.politico.com/2020-election/results/georgia/"))
votes = doc.css("[td*='votes-row']")
...which yields no results (=> [])
What am I doing wrong and how to fix? I'm ok with xpath – I just want to make sure changes made elsewhere in the doc don't affect finding these elements.
There's probably a better way but...
require 'nokogiri'
require 'open-uri'
doc = Nokogiri::HTML(URI.open("https://www.politico.com/2020-election/results/georgia/")) # URI.open: plain open no longer fetches URLs on Ruby 3+
votes = doc.css('tr[class*="candidate-row"]').map { |row| row.css('td').map { |cell| cell.content } }
biden_row = votes.find_index { |row| row[0] =~ /biden/i }
trump_row = votes.find_index { |row| row[0] =~ /trump/i }
biden_votes = votes[biden_row][1].split('%')[1]
trump_votes = votes[trump_row][1].split('%')[1]
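The values pulled out this way are still strings such as "2,450,193", so they need converting before any arithmetic. A small sketch, assuming the cell text runs the percentage and the count together (e.g. "49.4%2,450,193"); the cells hash is sample data and to_count is a helper name made up here.

```ruby
# Sample cell text in the assumed "percent then votes" shape.
cells = { 'Biden' => '49.4%2,450,193', 'Trump' => '49.4%2,448,635' }

# Take the part after the percent sign, drop thousands separators, convert.
to_count = ->(cell_text) { cell_text.split('%')[1].delete(',').to_i }

counts = cells.transform_values(&to_count)
margin = counts['Biden'] - counts['Trump']

puts counts.inspect
puts margin
```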
Edit: from the HTML source the relevant table looks like:
<table class="jsx-1526769828 candidate-table">
<thead class="jsx-3554868417 table-head">
<tr class="jsx-3554868417">
<th class="table-header jsx-3554868417 candidate-name">
<h5 class="jsx-3554868417">Candidate</h5>
</th>
<th class="table-header jsx-3554868417 percent">
<h5 class="jsx-3554868417">Pct.</h5>
</th>
<th class="table-header jsx-3554868417 vote-bar"></th>
</tr>
</thead>
<tbody class="jsx-2085888330 table-head">
<tr class="jsx-2677388595 candidate-row">
<td class="jsx-3948343365 candidate-name name-row">
<div class="jsx-1912693590 name-only candidate-short-name">Biden</div>
<div class="jsx-3948343365 candidate-party-tag">
<div class="jsx-1420258095 party-label dem">dem</div>
</div>
<div class="jsx-3948343365 candidate-winner-check"></div>
</td>
<td class="jsx-3830922081 percent percent-row">
<div class="candidate-percent-only jsx-3830922081">49.4%</div>
<div class="candidate-votes-next-to-percent jsx-3830922081">2,450,193</div>
</td>
<td class="jsx-3458171655 vote-bar vote-bar-row">
<div style="width:49.4%" class="jsx-3458171655 bar dem"></div>
</td>
</tr>
<tr class="jsx-2677388595 candidate-row">
<td class="jsx-3948343365 candidate-name name-row">
<div class="jsx-1912693590 name-only candidate-short-name">Trump*</div>
<div class="jsx-3948343365 candidate-party-tag">
<div class="jsx-1420258095 party-label gop">gop</div>
</div>
<div class="jsx-3948343365 candidate-winner-check"></div>
</td>
<td class="jsx-3830922081 percent percent-row">
<div class="candidate-percent-only jsx-3830922081">49.4%</div>
<div class="candidate-votes-next-to-percent jsx-3830922081">2,448,635</div>
</td>
<td class="jsx-3458171655 vote-bar vote-bar-row">
<div style="width:49.4%" class="jsx-3458171655 bar gop"></div>
</td>
</tr>
</tbody>
</table>
So you could probably use the candidate-votes-next-to-percent to get this value. e.g.:
require 'nokogiri'
require 'open-uri'
doc = Nokogiri::HTML(URI.open("https://www.politico.com/2020-election/results/georgia/")) # URI.open: plain open no longer fetches URLs on Ruby 3+
votes = doc.css('tr[class*="candidate-row"]').map do |row|
[
row.css('div[class*="candidate-short-name"]').first.content,
row.css('div[class*="candidate-votes-next-to-percent"]').first.content
]
end
# => [["Biden", "2,450,193"], ["Trump*", "2,448,635"]]

Fetch parent of a specific row in a table without iteration

Consider the table structure below, which contains many rows with multiple column values. I need to identify the parent of a specific row, which has to be identified using the cell.
<table class = 'grid'>
<thead id = 'header'>
</thead>
<tbody>
<tr>
<td>
<span class="group">
<span class="group__link"><a class="disabledlink">copy</a>
</span>
</span>
</td>
<td class="COLUMNNAME">ACE</td>
<td class="COLUMNLONGNAME">Adverse Childhood Experiences</td>
<li>Family Medicine</li>
<li>General Practice</li>
</td>
<td class="COLUMNSEXFILTER">Both</td>
<td class="COLUMNAGEFILTERMIN">Any</td>
<td class="COLUMNTYPE">Score Only</td>
</tr>
<tr>
<td class="nowrap" showactionitem="2">
<span class="group">
<span class="group__link"><a onclick="Check()" href="#">copy</a>
</span>
</span>
</td>
<td class="COLUMNNAME">AM-PAC</td>
<td class="COLUMNLONGNAME">AM-PAC Generic Outpatient Basic Mobility Short Form</td>
<td class="COLUMNNOTE"></td>
<td class="COLUMNRESTRICTEDYN">No</td>
<td class="COLUMNSPECIALTYID"></td>
<td class="COLUMNSEXFILTER">Both</td>
<td class="COLUMNAGEFILTERMIN">Any</td>
<td class="COLUMNTYPE">Score Only</td>
</tr>
<tr></tr>
<tr></tr>
</tbody></thead>
</table>
Likewise this table contains around 100 rows. I did the same using iteration and it is working fine.
Is it possible to find the parent of specific row without iteration?
You can use the parent method to find the parent of an element. Assuming that you have located a table cell, let's call it cell, you can get its row using parent and then the parent of the row with another call to parent:
cell.parent
#=> a <tr> element
cell.parent.parent
#=> the parent of the specific row - a <tbody> element in this case
Chaining multiple parent calls can become tedious and difficult to maintain. For example, you would have to call parent 4 times to get the table cell of the "copy" link. If you are after an ancestor (ie not immediate parent), you are better off using XPath:
cell.table(xpath: './ancestor::table')
#=> the <table> element containing the cell
browser.link(text: 'copy').tr(xpath: './ancestor::tr')
#=> the <tr> element containing a copy link
Hopefully Issue 451 will be implemented soon, which will remove the need for XPath. You would be able to call:
cell.parent(tag_name: 'table') # equivalent to `cell.table(xpath: './ancestor::table')`
There's no need for anything fancy, Watir has an Element#parent method.
You can use this one:
parent::node()
The example below selects the parent node of the input tag with id 'email'.
Ex: //input[@id='email']/parent::*
The above can also be rewritten as
//input[@id='email']/..
XPath tutorial for Selenium

Getting attributed html element

I'm trying to get table with content of MMEL codes from this site and I'm trying to accomplish it with CSS Selectors.
What I've got so far is:
require_relative 'sources/Downloader'
require 'nokogiri'
html_content = Downloader.download_page('http://www.s-techent.com/ATA100.htm')
parsed_html = Nokogiri::HTML(html_content)
tmp = parsed_html.css("tr[*]")
puts tmp.text
And I'm getting an error while trying to get this tr with an attribute. How can I complete this task and get this table in a simple form? I want to parse it to JSON. It would be nice to get this in sections and call it in an .each block.
EDIT:
It'd be nice if I could get things in blocks like this (look at the page's source)
<TR><TD WIDTH="10%" VALIGN="TOP" ROWSPAN=5>
<B><FONT FACE="Arial" SIZE=2><P ALIGN="CENTER">11</B></FONT></TD>
<TD WIDTH="40%" VALIGN="TOP" COLSPAN=2>
<B><FONT FACE="Arial" SIZE=2><P>PLACARDS AND MARKINGS</B></FONT></TD>
<TD WIDTH="50%" VALIGN="TOP">
<FONT FACE="Arial" SIZE=2><P ALIGN="LEFT">All procurable placards, labels, etc., shall be included in the illustrated Parts Catalog. They shall be illustrated, showing the part number, Legend and Location. The Maintenance Manual shall provide the approximate Location (i.e., FWD -UPPER -RH) and illustrate each placard, label, marking, self -illuminating sign, etc., required for safety information, maintenance significant information or by government regulations. Those required by government regulations shall be so identified.</FONT></TD>
</TR>
This should print all those TR's from source at line 96. There are three tables in that page and table[1] has all the text you needed:
require 'nokogiri'
require 'open-uri'
doc = Nokogiri::HTML(URI.open('http://www.s-techent.com/ATA100.htm'))
doc.css("table")[1].css("tr").each do |i|
puts i #=> prints the exact html between TR tags (including)
puts i.text #=> prints the text
end
For instance:
puts doc.css("table")[1].css("tr")[2]
prints the following:
<tr>
<td valign="TOP" colspan="3">
<b><font face="Arial" size="2"><p align="CENTER">GROUP DEFINITION - AIRCRAFT</p></font></b>
</td>
<td valign="TOP">
<font face="Arial" size="2"><p align="LEFT">The complete operational unit. Includes dimensions and
areas, lifting and shoring, leveling and weighing, towing and taxiing, parking and mooring,
required placards, servicing.</p></font>
</td>
</tr>
You could do the same using xpath also:
Below is the content from the first table of the webpage given in the post by OP:
require 'nokogiri'
require 'open-uri'
doc = Nokogiri.HTML(URI.open('http://www.s-techent.com/ATA100.htm'))
doc.xpath('(//table)[1]/tr').each do |tr|
puts tr.to_html(:encoding => 'utf-8')
end
Output:
<tr>
<td width="33%" valign="MIDDLE" colspan="2">
<p><img src="S-Tech-Logo-Blue2.gif" width="274" height="127"></p>
</td>
<td width="67%" valign="MIDDLE">
<b><i><font face="Arial" color="#0000ff">
<p align="CENTER"><big>AIRCRAFT PARTS MANUFACTURING ASSISTANCE (PMA)</big><br><big>DAR SERVICES</big></p></font></i></b>
</td>
</tr>
Now, if you want to collect the last table rows, then do:
require 'nokogiri'
require 'open-uri'
doc = Nokogiri.HTML(URI.open('http://www.s-techent.com/ATA100.htm'))
p doc.xpath('(//table)[3]/tr').to_a.size # => 1
doc.xpath('(//table)[3]/tr').each do |tr|
puts tr.to_html(:encoding => 'utf-8')
end
Output:
<tr>
<td width="40%" valign="TOP" height="10">
<p align="CENTER"><b><font face="Arial" size="2" color="#0000ff">149 AZALEA CIRCLE • LIMERICK, PA 19468-1330</font></b></p>
</td>
<td width="30%" valign="TOP" height="10">
<p align="CENTER"><b><font face="Arial" size="2" color="#0000ff">610-495-6898 (Office) • 484-680-0507 (Cell)</font></b></p>
</td>
<td width="110%" valign="TOP" height="10">
<p align="CENTER"><b><font face="Arial" size="2">E-mail S-Tech</font></b></p>
</td>
</tr>

Automating expanding trees and selecting checkboxes with Watir

On a site, there is tree of nodes each associated with a checkbox. Those nodes then expand further into more checkboxes.
It looks similar to the following where [] represents a checkbox:
+ [] All
+ [] Fruit
+ [] Vegetables
and then expanded looks like this:
+ [] All
- [] Fruit
[] apple
- [] Vegetables
[] potato
[] cucumber
There is then a button at the bottom of the screen that when pressed, gives you the price of the selected item.
I would like to write a script in Watir that does the following sequence of events:
1) Expands the node Fruit
2) Checks apple
3) Clicks the run button
4) Unchecks apple
5) Expands the node Vegetables
6) Checks potato
7) Clicks the run button
8) Unchecks potato
etc.. for all checkboxes and nodes
The tag for the apple checkbox looks like this:
<td onmouseover="TreeView_HoverNode(ContentPlaceHolder1_tvPartners_Data, this)" onmouseout="TreeView_UnhoverNode(this)" style="white-space:nowrap;" class="">
<input type="checkbox" name="ContentPlaceHolder1_tvPartnersn2CheckBox" id="ContentPlaceHolder1_tvPartnersn2CheckBox">
<a class="ContentPlaceHolder1_tvPartners_0" href="javascript:__doPostBack('ctl00$ContentPlaceHolder1$tvPartners','s0\\0\\189')" onclick="TreeView_SelectNode(ContentPlaceHolder1_tvPartners_Data, this,'ContentPlaceHolder1_tvPartnerst2');" id="ContentPlaceHolder1_tvPartnerst2">
<font style="color:#FF0000;">apple</font>
</td>
Also, there are more nodes and items that will be added to the list later so I need the script to take that into account and go through the checkboxes in order without calling the specific id of the checkbox.
Any help in solving this problem would greatly appreciated. Thank you very much!
Update:
I came up with this code but I'd like to check for the last checkbox/node rather than hardcoding the for loops. I would also like to skip the checkboxes for nodes such as Fruit and Vegetable.
for n in (1...250)
nodename = "ContentPlaceHolder1_tvPartnersn" + n.to_s
if ie.a(:id => nodename).exists?
ie.link(:id, nodename).click
end
end
ie.checkbox(:id => "ContentPlaceHolder1_tvPartnersn0CheckBox").clear
x = 1
for r in (1...250)
checkboxname = "ContentPlaceHolder1_tvPartnersn" + x.to_s + "CheckBox"
nodename = "ContentPlaceHolder1_tvPartnersn" + r.to_s
if ie.a(:id => nodename).exists?
x = x+1
checkboxname = "ContentPlaceHolder1_tvPartnersn" + x.to_s + "CheckBox"
end
if ie.checkbox(:id => checkboxname).exists?
ie.checkbox(:id => checkboxname).set
puts x
ie.checkbox(:id => checkboxname).clear
end
x = x + 1
end
Update: Here is more of the HTML. I actually have a hash set up as itemlist[[n,"item"]]=item, for example itemlist[[1,"item"]] = apple, that contains all the items I need to check the price on. Is there a way/would it be easier to see what the text is for each checkbox and then if itemlist.has_value?(checkbox_text) then check the box and assign that text to another hash? Basically, is there a way to check the boxes according to the text rather than the id of the checkbox?
<td><a id="ContentPlaceHolder1_tvPartnersn0" href="javascript:TreeView_ToggleNode(ContentPlaceHolder1_tvPartners_Data,0,document.getElementById('ContentPlaceHolder1_tvPartnersn0'),'-',document.getElementById('ContentPlaceHolder1_tvPartnersn0Nodes'))"><img src="/WebResource.axd?d=VNrMPzAA2o87avzl3UgiY8OisS6wrOp46COe6QqNhDQHCsy9zX-GTuzHAKk7njulOEns3hNoLIxbv9x1bv530iY_Shsd9ZHlF3pm4jNQi6u0zB6atkT0-K9kirzHDQNHYxlY8Q2&t=634963835619397560" alt="Collapse All (133,060)" style="border-width:0;" /></a></td><td onmouseover="TreeView_HoverNode(ContentPlaceHolder1_tvPartners_Data, this)" onmouseout="TreeView_UnhoverNode(this)" style="white-space:nowrap;"><input type="checkbox" name="ContentPlaceHolder1_tvPartnersn0CheckBox" id="ContentPlaceHolder1_tvPartnersn0CheckBox" /><a class="ContentPlaceHolder1_tvPartners_0" href="javascript:__doPostBack('ctl00$ContentPlaceHolder1$tvPartners','s0')" onclick="TreeView_SelectNode(ContentPlaceHolder1_tvPartners_Data, this,'ContentPlaceHolder1_tvPartnerst0');" id="ContentPlaceHolder1_tvPartnerst0">All (133,060)</a></td>
</tr>
</table><div id="ContentPlaceHolder1_tvPartnersn0Nodes" style="display:block;">
<table cellpadding="0" cellspacing="0" style="border-width:0;">
<tr>
<td><div style="width:20px;height:1px"></div></td><td><a id="ContentPlaceHolder1_tvPartnersn1" href="javascript:TreeView_ToggleNode(ContentPlaceHolder1_tvPartners_Data,1,document.getElementById('ContentPlaceHolder1_tvPartnersn1'),'t',document.getElementById('ContentPlaceHolder1_tvPartnersn1Nodes'))"><img src="/WebResource.axd?d=D2aGfOHUjBmg4quHNr-mKkyc5juoGHdurzZqtoCU3qo2d457eKX9x0d2AS3LrrQULzPjC-9wC6hLlMxSFEvU6c9r8LmzgOeKWAi6ouEEkShvclKr0&t=634963835619397560" alt="Expand Ace Communications Group (0) <img src='images/emergency.png' alt='alert' />" style="border-width:0;" /></a></td><td onmouseover="TreeView_HoverNode(ContentPlaceHolder1_tvPartners_Data, this)" onmouseout="TreeView_UnhoverNode(this)" style="white-space:nowrap;"><input type="checkbox" name="ContentPlaceHolder1_tvPartnersn1CheckBox" id="ContentPlaceHolder1_tvPartnersn1CheckBox" /><a class="ContentPlaceHolder1_tvPartners_0" href="javascript:__doPostBack('ctl00$ContentPlaceHolder1$tvPartners','s0\\0')" onclick="TreeView_SelectNode(ContentPlaceHolder1_tvPartners_Data, this,'ContentPlaceHolder1_tvPartnerst1');" id="ContentPlaceHolder1_tvPartnerst1">Fruit </a></td>
</tr>
</table><div id="ContentPlaceHolder1_tvPartnersn1Nodes" style="display:none;">
<table cellpadding="0" cellspacing="0" style="border-width:0;">
<tr>
<td><div style="width:20px;height:1px"></div></td><td><div style="width:20px;height:1px"><img src="/WebResource.axd?d=UZyrk961AUQRa1Dg14aXeNUU3AZcfF9PiakU0o_cO8MfbyWz58k50vr47p2ICDOjgAqF5UX_lVIhbj_y2BqKRU5Xwhic3cBNooK1CBd_cGP6COn60&t=634963835619397560" alt="" /></div></td><td><img src="/WebResource.axd?d=OftTkmJCEf6tGohvvdo_cbMxdnyHMLxScANk1YxbAhfKKp3_gvqoKFAIbK4gGFAKMagH78cKVSIS61WrK5fGcCaHWVUMPjXLTtDZIJISdqqtXFNI0&t=634963835619397560" alt="" /></td><td onmouseover="TreeView_HoverNode(ContentPlaceHolder1_tvPartners_Data, this)" onmouseout="TreeView_UnhoverNode(this)" style="white-space:nowrap;"><input type="checkbox" name="ContentPlaceHolder1_tvPartnersn2CheckBox" id="ContentPlaceHolder1_tvPartnersn2CheckBox" /><a class="ContentPlaceHolder1_tvPartners_0" href="javascript:__doPostBack('ctl00$ContentPlaceHolder1$tvPartners','s0\\0\\189')" onclick="TreeView_SelectNode(ContentPlaceHolder1_tvPartners_Data, this,'ContentPlaceHolder1_tvPartnerst2');" id="ContentPlaceHolder1_tvPartnerst2"><font style='color:#FF0000;'>apple</font> </a></td>
</tr>
</table>
</div><table cellpadding="0" cellspacing="0" style="border-width:0;">
<tr>
<td><div style="width:20px;height:1px"></div></td><td><a id="ContentPlaceHolder1_tvPartnersn3" href="javascript:TreeView_ToggleNode(ContentPlaceHolder1_tvPartners_Data,3,document.getElementById('ContentPlaceHolder1_tvPartnersn3'),'t',document.getElementById('ContentPlaceHolder1_tvPartnersn3Nodes'))"><img src="/WebResource.axd?d=D2aGfOHUjBmg4quHNr-mKkyc5juoGHdurzZqtoCU3qo2d457eKX9x0d2AS3LrrQULzPjC-9wC6hLlMxSFEvU6c9r8LmzgOeKWAi6ouEEkShvclKr0&t=634963835619397560" alt="Expand Advantage (0)" style="border-width:0;" /></a></td><td onmouseover="TreeView_HoverNode(ContentPlaceHolder1_tvPartners_Data, this)" onmouseout="TreeView_UnhoverNode(this)" style="white-space:nowrap;"><input type="checkbox" name="ContentPlaceHolder1_tvPartnersn3CheckBox" id="ContentPlaceHolder1_tvPartnersn3CheckBox" /><a class="ContentPlaceHolder1_tvPartners_0" href="javascript:__doPostBack('ctl00$ContentPlaceHolder1$tvPartners','s0\\0')" onclick="TreeView_SelectNode(ContentPlaceHolder1_tvPartners_Data, this,'ContentPlaceHolder1_tvPartnerst3');" id="ContentPlaceHolder1_tvPartnerst3">Vegetable</a></td>
</tr>
</table><div id="ContentPlaceHolder1_tvPartnersn3Nodes" style="display:none;">
<table cellpadding="0" cellspacing="0" style="border-width:0;">
<tr>
<td><div style="width:20px;height:1px"></div></td><td><div style="width:20px;height:1px"><img src="/WebResource.axd?d=UZyrk961AUQRa1Dg14aXeNUU3AZcfF9PiakU0o_cO8MfbyWz58k50vr47p2ICDOjgAqF5UX_lVIhbj_y2BqKRU5Xwhic3cBNooK1CBd_cGP6COn60&t=634963835619397560" alt="" /></div></td><td><img src="/WebResource.axd?d=yCq0KCcfK0lqwrgCU1UxuFJ0bJHMKjxD6S5t8OvIWXwTUBOYh1ZiQA4lD3ZpRuMNI-itrPIn3_rFzvZtrMP5g7PyyensT-Z003WldrY9pIgMSY5p0&t=634963835619397560" alt="" /></td><td onmouseover="TreeView_HoverNode(ContentPlaceHolder1_tvPartners_Data, this)" onmouseout="TreeView_UnhoverNode(this)" style="white-space:nowrap;"><input type="checkbox" name="ContentPlaceHolder1_tvPartnersn4CheckBox" id="ContentPlaceHolder1_tvPartnersn4CheckBox" /><a class="ContentPlaceHolder1_tvPartners_0" href="javascript:__doPostBack('ctl00$ContentPlaceHolder1$tvPartners','s0\\0\\119')" onclick="TreeView_SelectNode(ContentPlaceHolder1_tvPartners_Data, this,'ContentPlaceHolder1_tvPartnerst4');" id="ContentPlaceHolder1_tvPartnerst4">potato</a></td>
</tr>
</table><table cellpadding="0" cellspacing="0" style="border-width:0;">
<tr>
<td><div style="width:20px;height:1px"></div></td><td><div style="width:20px;height:1px"><img src="/WebResource.axd?d=UZyrk961AUQRa1Dg14aXeNUU3AZcfF9PiakU0o_cO8MfbyWz58k50vr47p2ICDOjgAqF5UX_lVIhbj_y2BqKRU5Xwhic3cBNooK1CBd_cGP6COn60&t=634963835619397560" alt="" /></div></td><td><img src="/WebResource.axd?d=yCq0KCcfK0lqwrgCU1UxuFJ0bJHMKjxD6S5t8OvIWXwTUBOYh1ZiQA4lD3ZpRuMNI-itrPIn3_rFzvZtrMP5g7PyyensT-Z003WldrY9pIgMSY5p0&t=634963835619397560" alt="" /></td><td onmouseover="TreeView_HoverNode(ContentPlaceHolder1_tvPartners_Data, this)" onmouseout="TreeView_UnhoverNode(this)" style="white-space:nowrap;"><input type="checkbox" name="ContentPlaceHolder1_tvPartnersn5CheckBox" id="ContentPlaceHolder1_tvPartnersn5CheckBox" /><a class="ContentPlaceHolder1_tvPartners_0" href="javascript:__doPostBack('ctl00$ContentPlaceHolder1$tvPartners','s0\\0\\1')" onclick="TreeView_SelectNode(ContentPlaceHolder1_tvPartners_Data, this,'ContentPlaceHolder1_tvPartnerst5');" id="ContentPlaceHolder1_tvPartnerst5">cucumber</a></td>
</tr>
Instead of hard-coding a loop for 1 to 250, you should be using collections to find all elements that match a particular criteria.
So instead of
for n in (1...250)
nodename = "ContentPlaceHolder1_tvPartnersn" + n.to_s
if ie.a(:id => nodename).exists?
ie.link(:id, nodename).click
end
end
It should be
ie.links(:id => /ContentPlaceHolder1_tvPartnersn/).each do |link|
link.click
end
Which could also be written as
ie.links(:id => /ContentPlaceHolder1_tvPartnersn/).each(&:click)
The key here is that ie.links(:id => /ContentPlaceHolder1_tvPartnersn/) returns all links that match the criteria or locator. A regular expression (regex) is used for checking the id as this allows partial matching of ids (ie any link id that contains the text ContentPlaceHolder1_tvPartnersn).
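To see what the partial match does and does not catch, the regex can be tried on a few ids taken from the posted HTML. This runs in plain Ruby; the anchored pattern is an optional, stricter variant suggested here, not something the page requires.

```ruby
ids = %w[
  ContentPlaceHolder1_tvPartnersn1
  ContentPlaceHolder1_tvPartnersn1CheckBox
  ContentPlaceHolder1_tvPartnerst1
  ContentPlaceHolder1_tvPartnersn1Nodes
]

# Unanchored: matches any id that merely contains the text.
partial = /ContentPlaceHolder1_tvPartnersn/

# Anchored: the whole id must be the prefix followed by digits only.
anchored = /\AContentPlaceHolder1_tvPartnersn\d+\z/

partial_hits  = ids.grep(partial)
anchored_hits = ids.grep(anchored)

puts partial_hits
puts anchored_hits
```

With ie.links the element type already narrows the candidates to links, so the partial pattern is usually enough; the anchored form only matters if other links share the same id prefix.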
The same logic could be applied to the rest of your script (note that without a complete html sample, I am guessing a bit on what your code does):
#Expand the entire tree
ie.links(:id => /ContentPlaceHolder1_tvPartnersn/).each(&:click)
#Clear the first checkbox
ie.checkbox(:id => "ContentPlaceHolder1_tvPartnersn0CheckBox").clear
#Set and clear each checkbox
checkbox_id = /ContentPlaceHolder1_tvPartnersn/
ie.checkboxes(:id => checkbox_id).each do |checkbox|
checkbox.set
puts checkbox.id
checkbox.clear
end
Depending on the html for the 'fruit' and 'vegetable' nodes, they may already get ignored by this code. Otherwise, you might need to change the checkbox_id or add a conditional check.
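One possible conditional check, going by the posted HTML: parent nodes such as Fruit and Vegetable have a toggle link whose id ends in the node number (n1, n3), while leaf items do not. A sketch of that filter on plain id strings, where the id lists are copied from the sample markup and node_number is a helper made up here; in the real script the lists would come from ie.links and ie.checkboxes.

```ruby
# Ids of the expand/collapse links present in the sample markup.
toggle_link_ids = %w[
  ContentPlaceHolder1_tvPartnersn0
  ContentPlaceHolder1_tvPartnersn1
  ContentPlaceHolder1_tvPartnersn3
]

# Ids of all checkboxes in the sample markup (nodes 0 to 5).
checkbox_ids = (0..5).map { |n| "ContentPlaceHolder1_tvPartnersn#{n}CheckBox" }

# Extract the node number that follows "tvPartnersn".
node_number = ->(id) { id[/tvPartnersn(\d+)/, 1].to_i }

parent_numbers = toggle_link_ids.map(&node_number)

# A checkbox is a leaf when no toggle link exists for its node number.
leaf_checkbox_ids = checkbox_ids.reject { |id| parent_numbers.include?(node_number.call(id)) }

puts leaf_checkbox_ids
```

In the sample markup the remaining ids belong to apple, potato and cucumber.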
Update
If you want to set the checkbox based on the text, you can do:
ie.td(:text => 'apple').checkbox.set
