I have an HTML file, shown below:
<table id="plans" class="brand-table">
<thead>
<tr>
<th class="domain">Plans</th>
<th class="basic">Basic</th>
<th class="plus">Plus</th>
<th class="prime">Prime</th>
</tr>
</thead>
<tbody>
<tr class="even">
<td>
www.test.com
</td>
<td>
<input name="upgrade" type="radio">
<!-- this span element is hidden -->
<span class="plan_status"></span>
</td>
<td>
<input name="upgrade" value="plus www.test.com" type="radio">
<!-- this span element is hidden -->
<span class="plan_status"></span>
</td>
<td>
<input name="upgrade" value="prime www.test.com" checked="" type="radio">
<span class="plan_status">current</span>
</td>
</tr>
</tbody>
</table>
I want to find out which plan is the current plan on the page using Ruby Watir. Below is my script:
require 'watir'
browser = Watir::Browser.new(:chrome)
browser.goto('file:///C:/Users/Ashwin/Desktop/new.html')
browser.table(:id, 'plans').tds.each do |table_row|
  if table_row.input(:value, 'plus www.test.com').text =~ /current/i
    p 'current plan status is plus'
  elsif table_row.input(:value, 'prime www.test.com').text =~ /current/i
    p 'current plan status is prime'
  else
    p 'current plan status is basic'
  end
end
But I get the following error instead:
C:/Ruby193/lib/ruby/gems/1.9.1/gems/watir-webdriver-0.6.11/lib/watir-webdriver/elements/element.rb:513:in `assert_exists': unable to locate element, using {:value=>"plus www.test.com", :tag_name=>"input"} (Watir::Exception::UnknownObjectException)
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/watir-webdriver-0.6.11/lib/watir-webdriver/elements/element.rb:86:in `text'
from C:/Users/Name/Documents/NetBeansProjects/RubyApplication6/lib/new_main15.rb:8:in `block in <main>'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/watir-webdriver-0.6.11/lib/watir-webdriver/element_collection.rb:29:in `each'
from C:/Ruby193/lib/ruby/gems/1.9.1/gems/watir-webdriver-0.6.11/lib/watir-webdriver/element_collection.rb:29:in `each'
from C:/Users/Name/Documents/NetBeansProjects/RubyApplication6/lib/new_main15.rb:7:in `<main>'
I want the output to be:
current plan status is prime
Can anyone please help?
Thanks in advance
Instead of checking which td element has the "current" text, I would suggest checking which radio has the checked attribute. This reduces the number of elements you have to worry about interacting with.
You can find the selected radio using:
table_row.radios.find(&:set?).value
You can then check the value of the radio to see if it starts with the word "plus" or "prime":
# Note that we scope to the tbody to ignore the header row.
# Also make sure you do `trs` not `tds` for the rows.
table_rows = browser.table(id: 'plans').tbody.trs
# Iterate through the rows and check the checked radio button
table_rows.each do |table_row|
  case table_row.radios.find(&:set?).value
  when /^plus/
    p 'current plan status is plus'
  when /^prime/
    p 'current plan status is prime'
  else
    p 'current plan status is basic'
  end
end
Note that for older versions of Ruby (i.e. v1.9), you will need to find the selected radio using:
table_row.radios.find { |r| r.set? }.value
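The mapping from the radio's value to a plan name can also be pulled out into a small helper, so that it can be checked without a browser. This is only a sketch: plan_from_radio_value is a hypothetical name, and it assumes the value strings from the HTML above ("plus www.test.com", "prime www.test.com"). Anything else is treated as basic, because the Basic radio has no value attribute.

```ruby
# Hypothetical helper (not part of Watir): maps a radio button's
# value attribute to a plan name.
def plan_from_radio_value(value)
  case value.to_s
  when /\Aplus\b/  then 'plus'
  when /\Aprime\b/ then 'prime'
  else 'basic' # the Basic radio carries no plan name in its value
  end
end

puts plan_from_radio_value('prime www.test.com')
puts plan_from_radio_value('plus www.test.com')
puts plan_from_radio_value(nil)
```

Inside the loop above it would be called as plan_from_radio_value(table_row.radios.find(&:set?).value).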
Given the following HTML code:
<tr>
<th scope="row" class="navbox-group">Family</th>
<td class="navbox-list navbox-even hlist" style="text-align:left;border-left-width:2px;border-left-style:solid;width:100%;padding:0px">
<div style="padding:0em 0.25em">
<ul>
<li>Andrew Parker Bowles <small>(first husband)</small></li>
<li>Tom Parker Bowles <small>(son)</small></li>
<li>Laura Lopes <small>(daughter)</small></li>
<li>Charles, Prince of Wales <small>(second husband)</small></li>
<li>Bruce Shand <small>(father)</small></li>
<li>Rosalind Shand <small>(mother)</small></li>
<li>Annabel Elliot <small>(sister)</small></li>
<li>Mark Shand <small>(brother)</small></li>
</ul>
</div>
</td>
</tr>
I want to get all the href attributes within the tr element, but only from tr elements that contain:
<th scope="row" class="navbox-group">Family</th>
(Where th='Family')
I tried the following XPath:
"//tr[@th='Family']//a/@href"
But I don't get any href.
Thanks a lot.
Shany
Try the XPath below:
//tr[th="Family"]//@href
It should return the list of links from each tr that contains a th with the text "Family". Note that th is a child element here, not an attribute, so it is written without @.
I'm trying to work out how to store an HTML table of drive stats in a database, but the developers have been a bit clever and started using GIFs to represent pass/fail/health stats.
Here's a snippet of what I've got:
<tr class="status">
<td class="status"><img border="0" src="/tick_green.gif"></td>
<td class="status">8</td>
<td class="status">Ready</td>
<td class="status"><img border="0" src="/bar10.gif"></td>
<td class="status">SEAGATE ST3146807FC</td>
<td class="status">10000 RPM</td>
<td class="status">3HY61AG9</td>
<td class="status">XR12</td>
<td class="status">286749488</td>
<td class="status"> 28.0°C</td>
<td class="status" style="background-color: #00fa00">
</td>
And here's some of the Ruby that I've written so far to strip the tags:
table = page.parser.xpath('//table/caption[contains(.,"Drive")]/..')
table.xpath('//table//tr').each do |row|
  row.xpath('td').each do |cell|
    puts cell.to_html.gsub(/<a[^>]+>/,'').gsub(/<td[^>]+>/,'').gsub(/<\/td[^>]*>/,'').gsub(/<\/a[^>]*>/,'')
    #puts cell.text
  end
end
I can now get semi-rational output:
<img border="0" src="/tick_green.gif">
15
Ready
<img border="0" src="/bar10.gif">
SEAGATE ST3146807FC
10000 RPM
3HY61ASW
XR12
286749488
29.0°C
But I want to replace a couple of other cell elements with plain text.
For example, the tick_green image can also be '/cross_red.gif' or '/caution.gif', which I want to replace with regular text. Likewise, I want to replace the bar10.gif image with just the text '10'.
Is it best to come up with a whole bunch of values for all of my special cases?
I'd do some gsubbing.
E.g.:
example = <<-STRING
<img border="0" src="/tick_green.gif">
15
Ready
<img border="0" src="/bar10.gif">
SEAGATE ST3146807FC
10000 RPM
3HY61ASW
XR12
286749488
29.0°C
STRING
replace = Hash.new("#unknown")
replace['tick_green.gif'] = "[OK]"
replace['bar10.gif'] = "[10]"
regex = /<img [^>]* src="\/(.*)">/
result = example.gsub(regex) { replace[$1] }
I'd like to replace the $1 with a named backreference, but I don't know how yet.
http://ruby-doc.org/core-1.9.3/String.html#method-i-gsub
edit: result from above
[OK]
15
Ready
[10]
SEAGATE ST3146807FC
10000 RPM
3HY61ASW
XR12
286749488
29.0°C
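On the named backreference: one way that should work is a named group, (?<file>...), read back inside the block through Regexp.last_match. The sketch below uses a shortened copy of the example string; the group name file is my own choice, and the pattern is tightened slightly to [^"]* so it cannot run past the closing quote.

```ruby
example = <<-STRING
<img border="0" src="/tick_green.gif">
15
Ready
<img border="0" src="/bar10.gif">
STRING

replace = Hash.new('#unknown')
replace['tick_green.gif'] = '[OK]'
replace['bar10.gif'] = '[10]'

# (?<file>...) is a named capture group; inside the gsub block the
# current match is available as Regexp.last_match (same object as $~).
regex = /<img [^>]*src="\/(?<file>[^"]*)">/
result = example.gsub(regex) { replace[Regexp.last_match[:file]] }
puts result
```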
A case statement will clean that up a little bit:
row.css('td').each do |td|
  img = td.at('img')
  puts case
    when img && img[:src][/bar(\d+)\.gif/] then $1
    when img && img[:src][/tick_green/] then 'ok'
    else td.text.strip
  end
end
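To cover the other icons mentioned in the question without a long list of when branches, the special cases can live in a small lookup table. This is a sketch under assumptions: text_for_image and STATUS_ICONS are hypothetical names, and the labels 'fail' and 'caution' are placeholders, since the question does not say what text those icons should become.

```ruby
# Hypothetical labels for the status icons; adjust to taste.
STATUS_ICONS = {
  'tick_green' => 'ok',
  'cross_red'  => 'fail',
  'caution'    => 'caution'
}.freeze

# Turns an image src such as "/bar10.gif" into replacement text.
def text_for_image(src)
  name = File.basename(src, '.gif')
  if (m = name.match(/\Abar(\d+)\z/))
    m[1] # barNN.gif becomes the number NN
  else
    STATUS_ICONS.fetch(name, name) # unknown icons fall back to their name
  end
end

puts text_for_image('/tick_green.gif')
puts text_for_image('/bar10.gif')
puts text_for_image('/cross_red.gif')
```

In the loop above, the image branch would then reduce to text_for_image(img[:src]) whenever img is present.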
I have the following piece of HTML:
<table id="movies">
<thead><tr>
<th>Movie Title</th>
<th>Rating</th>
<th>Release Date</th>
<th>More Info</th>
</tr></thead>
<tbody>
<tr>
<td>The Terminator</td>
<td>R</td>
<td>1984-10-26 00:00:00 UTC</td>
<td>More about The Terminator</td>
</tr>
<tr>
<td>When Harry Met Sally</td>
<td>R</td>
<td>1989-07-21 00:00:00 UTC</td>
<td>More about When Harry Met Sally</td>
</tr>
<tr>
<td>Amelie</td>
<td>R</td>
<td>2001-04-25 00:00:00 UTC</td>
<td>More about Amelie</td>
</tr>
</tbody>
</table>
Now I want to write a Cucumber step that checks whether the specified ratings (the second column) are on the page.
So I wrote this (it is part of a larger step definition, but I have checked that everything works up to this point):
txt = "//table[@id='movies']/tbody//td[2]"
page.all(:xpath, txt) do |element|
  debugger
  puts element.text
end
However, there seems to be a small error somewhere, because execution never enters the page.all block; the debugger is never invoked, for instance.
Any help is appreciated :)
After a few meditative minutes the problem resolved itself in my mind :)
I simply needed to call .each on the result of page.all, which returns the collection of matching elements rather than iterating over them:
txt = "//table[@id='movies']/tbody//td[2]"
page.all(:xpath, txt).each do |element|
  debugger
  puts element.text
end
I'd like to parse an HTML page with Nokogiri. There is a table in part of the page which does not use any specific ID. Is it possible to extract something like:
Today,3,455,34
Today,1,1300,3664
Today,10,100000,3444
Yesterday,3454,5656,3
Yesterday,3545,1000,10
Yesterday,3411,36223,15
From this HTML:
<div id="__DailyStat__">
<table>
<tr class="blh"><th colspan="3">Today</th><th class="r" colspan="3">Yesterday</th></tr>
<tr class="blh"><th>Qnty</th><th>Size</th><th>Length</th><th class="r">Length</th><th class="r">Size</th><th class="r">Qnty</th></tr>
<tr class="blr">
<td>3</td>
<td>455</td>
<td>34</td>
<td class="r">3454</td>
<td class="r">5656</td>
<td class="r">3</td>
</tr>
<tr class="bla">
<td>1</td>
<td>1300</td>
<td>3664</td>
<td class="r">3545</td>
<td class="r">1000</td>
<td class="r">10</td>
</tr>
<tr class="blr">
<td>10</td>
<td>100000</td>
<td>3444</td>
<td class="r">3411</td>
<td class="r">36223</td>
<td class="r">15</td>
</tr>
</table>
</div>
As a quick and dirty first pass I'd do:
html = <<EOT
<div id="__DailyStat__">
<table>
<tr class="blh"><th colspan="3">Today</th><th class="r" colspan="3">Yesterday</th></tr>
<tr class="blh"><th>Qnty</th><th>Size</th><th>Length</th><th class="r">Length</th><th class="r">Size</th><th class="r">Qnty</th></tr>
<tr class="blr">
<td>3</td>
<td>455</td>
<td>34</td>
<td class="r">3454</td>
<td class="r">5656</td>
<td class="r">3</td>
</tr>
<tr class="bla">
<td>1</td>
<td>1300</td>
<td>3664</td>
<td class="r">3545</td>
<td class="r">1000</td>
<td class="r">10</td>
</tr>
<tr class="blr">
<td>10</td>
<td>100000</td>
<td>3444</td>
<td class="r">3411</td>
<td class="r">36223</td>
<td class="r">15</td>
</tr>
</table>
</div>
EOT
#       Today                Yesterday
# Qnty  Size    Length  Length  Size   Qnty
# 3     455     34      3454    5656   3
# 1     1300    3664    3545    1000   10
# 10    100000  3444    3411    36223  15
require 'nokogiri'
doc = Nokogiri::HTML(html)
Use CSS to find the start of the table, and define some places to hold the data we're capturing:
table = doc.at('div#__DailyStat__ table')
today_data = []
yesterday_data = []
Loop over the rows in the table, rejecting the headers:
table.search('tr').each do |tr|
  next if (tr['class'] == 'blh')
Initialize arrays to capture the pertinent data from each row, selectively push the data into the appropriate array:
  today_td_data = [ 'Today' ]
  yesterday_td_data = [ 'Yesterday' ]
  tr.search('td').each do |td|
    if (td['class'] == 'r')
      yesterday_td_data << td.text.to_i
    else
      today_td_data << td.text.to_i
    end
  end
  today_data << today_td_data
  yesterday_data << yesterday_td_data
end
And output the data:
puts today_data.map{ |a| a.join(',') }
puts yesterday_data.map{ |a| a.join(',') }
> Today,3,455,34
> Today,1,1300,3664
> Today,10,100000,3444
> Yesterday,3454,5656,3
> Yesterday,3545,1000,10
> Yesterday,3411,36223,15
Just to help you visualize what's going on: at the exit from the "tr" loop, today_data and yesterday_data are arrays of arrays. For example, today_data looks like:
[["Today", 3, 455, 34], ["Today", 1, 1300, 3664], ["Today", 10, 100000, 3444]]
Alternatively, instead of looping over the "td" tags and sensing the class for the tag, I could have grabbed the contents of the "tr" and then used scan to grab the numbers and sliced the resulting array into "today" and "yesterday" arrays:
tr_data = tr.text.scan(/\d+/).map{ |i| i.to_i }
today_td_data = [ 'Today', *tr_data[0, 3] ]
yesterday_td_data = [ 'Yesterday', *tr_data[3, 3] ]
In real-world development, like at work, I'd use that instead of what I first wrote because it's succinct.
And notice that I didn't use XPath. It's very doable in Nokogiri to use XPath and accomplish this, but for simplicity I prefer CSS accessors. XPath would have allowed accessing individual "td" tag contents, but it also would begin to look like line-noise, which is something we want to avoid when writing code, because it impacts maintenance. I could also have used CSS to drill down to the correct "td" tags like 'tr td.r', but I don't think it would improve the code, it would just be an alternate way of doing it.