Brainstorming a method of validating an HTML table's content with Watir-webdriver - ruby

I am trying to work out a method to check the content of an HTML table with Watir-webdriver. Basically, I want to validate the table contents against a saved valid table (CSV file) and confirm that they are the same after a refresh or redraw action.
Ideas I've come up with so far are to:
Grab the table HTML and compare that as a string with the baseline value.
Iterate through each cell and compare the HTML or text content.
Generate a 2D array representation of the table contents and do an array compare.
What would be the fastest/best approach? Do you have insights on how you handled a similar problem?
Here is an example of the table:
<table id="attr-table">
<thead>
<tr><th id="attr-action-col"><input type="checkbox" id="attr-action-col_box" class="attr-action-box" value=""></th><th id="attr-scope-col"></th><th id="attr-workflow-col">Status</th><th id="attr-type-col"></th><th id="attr-name-col">Name<span class="ui-icon ui-icon-triangle-1-n"></span></th><th id="attr-value-col">Francais Value</th></tr></thead>
<tbody>
<tr id="attr-row-209"><td id="attr_action_209" class="attr-action-col"><input type="checkbox" id="attr_action_209_box" class="attr-action-box" value=""></td><td id="attr_scope_209" class="attr-scope-col"><img src="images/attrib_bullet_global.png" title="global"></td><td id="attr_workflow_209" class="attr-workflow-col"></td><td id="attr_type_209" class="attr-type-col"><img src="images/attrib_text.png"></td><td id="attr_name_209" class="attr-name-col">Name of: Catalogue</td><td id="attr_value_209" class="attr-value-col"><p class="acms ws-editable-content lang_10">2010 EI-176</p></td></tr>
<tr id="attr-row-316"><td id="attr_action_316" class="attr-action-col"><input type="checkbox" id="attr_action_316_box" class="attr-action-box" value=""></td><td id="attr_scope_316" class="attr-scope-col"><img src="images/attrib_bullet_global.png" title="global"></td><td id="attr_workflow_316" class="attr-workflow-col"></td><td id="attr_type_316" class="attr-type-col"><img src="images/attrib_text.png"></td><td id="attr_name_316" class="attr-name-col">_[Key] Media key</td><td id="attr_value_316" class="attr-value-col"><p class="acms ws-editable-content lang_10"><span class="acms acms-choice" contenteditable="false" id="568">163</span></p></td></tr>
<tr id="attr-row-392"><td id="attr_action_392" class="attr-action-col"><input type="checkbox" id="attr_action_392_box" class="attr-action-box" value=""></td><td id="attr_scope_392" class="attr-scope-col"><img src="images/attrib_bullet_global.png" title="global"></td><td id="attr_workflow_392" class="attr-workflow-col"></td><td id="attr_type_392" class="attr-type-col"><img src="images/attrib_numeric.png"></td><td id="attr_name_392" class="attr-name-col">_[Key] Numéro d'ordre</td><td id="attr_value_392" class="attr-value-col"><p class="acms ws-editable-content lang_10">2</p></td></tr>
</tbody>
</table>

Just one idea I came up with. I used hashes and a class instead of a 2D array.
foo.csv
209,global,text,Catalogue,2010 EI-176
392,global,numeric,Numéro d'ordre,2
require 'csv'
expected_datas = CSV.readlines('foo.csv').map do |row|
  {
    :id => row[0],
    :scope => row[1],
    :type => row[2],
    :name => row[3],
    :value => row[4]
  }
end
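# Note: on Ruby 3.2+, Data is a core class, so a different class name may be needed there.
class Data
  attr_reader :id, :scope, :type, :name, :value
  def initialize(tr)
    # Assign instance variables so that the attr_readers return the parsed values.
    @id = tr.id.slice(/attr-row-([0-9]+)/, 1)
    @scope = tr.td(:id, /scope/).img.src.slice(/attrib_bullet_(.+?)\.png/, 1)
    @type = tr.td(:id, /type/).img.src.slice(/attrib_(.+?)\.png/, 1)
    @name = tr.td(:id, /name/).text
    @value = tr.td(:id, /value/).text
  end
end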
browser = Watir::Browser.new
browser.goto 'foobar'
datas = browser.table(:id, 'attr-table').tbody.trs.map { |tr| Data.new(tr) }
# Compare the rendered table against the CSV baseline.
datas.zip(expected_datas).each do |data, expected_data|
  Data.instance_methods(false).each do |method|
    data.send(method).should == expected_data[method.to_sym]
  end
end
# some action (refresh or redraw)
browser.refresh
after_datas = browser.table(:id, 'attr-table').tbody.trs.map { |tr| Data.new(tr) }
# Compare the table before and after the action.
datas.zip(after_datas).each do |data, after_data|
  Data.instance_methods(false).each do |method|
    data.send(method).should == after_data.send(method)
  end
end

What level of detail do you want the mismatch(es) reported with? I think that might well define the approach you want to take.
For example, if you just want to know whether there's a mismatch, and don't care where, then comparing arrays might be easiest.
If the order of the rows could vary, then I think comparing Hashes might be best.
If you want each mismatch reported individually, then iterating by row and column would allow you to report discrete errors, especially if you build a list of differences and then do your assert at the very end based on the number of differences found.
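A minimal sketch of the "build a list of differences, assert once at the end" idea, assuming both tables have already been reduced to hashes keyed by row id. The table_differences helper and the sample data are hypothetical and not part of Watir:

```ruby
# Compare two tables represented as { row_id => { column => value } }.
# Keying by row id makes the comparison independent of row order.
def table_differences(expected, actual)
  differences = []
  (expected.keys | actual.keys).each do |row_id|
    if !expected.key?(row_id)
      differences << "row #{row_id}: unexpected row"
    elsif !actual.key?(row_id)
      differences << "row #{row_id}: missing row"
    else
      expected[row_id].each do |column, value|
        found = actual[row_id][column]
        next if found == value
        differences << "row #{row_id}, #{column}: expected #{value.inspect}, got #{found.inspect}"
      end
    end
  end
  differences
end

# Made-up sample data; the rows are deliberately in a different order.
expected = {
  '209' => { name: 'Name of: Catalogue', value: '2010 EI-176' },
  '392' => { name: "_[Key] Numéro d'ordre", value: '2' }
}
actual = {
  '392' => { name: "_[Key] Numéro d'ordre", value: '3' },
  '209' => { name: 'Name of: Catalogue', value: '2010 EI-176' }
}

differences = table_differences(expected, actual)
puts differences
```

A single assertion such as checking that differences is empty can then report every mismatch at once instead of stopping at the first one.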

You could go for an exact match:
before_htmltable == after_htmltable
Or you could normalize whitespace first:
before_htmltable.gsub(/\s+/, ' ') == after_htmltable.gsub(/\s+/, ' ')
I would think that creating the array and then comparing each element would be more expensive.
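A small sketch of the whitespace-insensitive comparison, using made-up markup strings and a hypothetical normalize_markup helper. It compares with ==, which returns true or false; <=> returns 0 for equal strings, and 0 is truthy in Ruby, so it is easy to misuse in a condition:

```ruby
# Collapse runs of whitespace and trim, so formatting differences don't count.
def normalize_markup(html)
  html.gsub(/\s+/, ' ').strip
end

before_html = "<table>\n  <tr><td>2010 EI-176</td></tr>\n</table>"
after_html  = "<table> <tr><td>2010 EI-176</td></tr> </table>"

same = normalize_markup(before_html) == normalize_markup(after_html)
puts same
```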
Dave

Related

accessing the text value of last nested <tr> element with no id or class hooks

I need to access the value of the 10th <td> element in the last row of a table. I can't use an ID as a hook because only the table has an ID. I've managed to make it work using the code below. Unfortunately, it's static. I know I will always need the 10th <td> element, but I won't ever know which row it will be in. I just know it needs to be the last row in the table. How would I replace "tr[6]" with the actual last <tr> dynamically? (This is probably really easy, but this is literally my first time doing anything with Ruby.)
page = Nokogiri::HTML(open(url))
test = page.css("tr[6]").map { |row|
row.css("td[10]").text}
puts test
You want to do:
page.at("tr:last td:eq(10)")
If you do not need to do anything else with the page, you can make this a single line (Ruby arrays are 0-based, so index 9 is the 10th <td>):
test = Nokogiri::HTML(open(url)).search("tr").last.search("td")[9].text
Otherwise, this will work:
page = Nokogiri::HTML(open(url))
test = page.search("tr").last.search("td")[9].text
puts test
Example (using a large table from another question on Stack Overflow):
Nokogiri::HTML(open("http://en.wikipedia.org/wiki/Richard_Dreyfuss")).search('table')[1].search('tr').last.search('td').children.map{|c| c.text}.join(" ")
#=> "2013 Paranoia Francis Cassidy"
Is there a particular reason you want an Array with 1 element? My example will return a string but you could easily modify it to return an Array.
You can use CSS pseudo class selectors for this:
page.css("table#the-table-id tr:last-of-type td:nth-of-type(10)")
This first selects the <table> with the appropriate id, then selects the last <tr> child of that table, and then selects the 10th <td> of that <tr>. The result is a NodeSet (an array-like collection) of all matching elements; if you expect there to be only one, you could use at_css instead.
If you prefer XPath, you could use this:
page.xpath("//table[@id='the-table-id']/tr[last()]/td[10]")
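One thing to watch when mixing these approaches: positions in CSS and XPath selectors such as td:nth-of-type(10) or td[10] are 1-based, while Ruby array indexes are 0-based. A sketch with plain arrays standing in for parsed rows (the sample data is made up and no parser is involved):

```ruby
# Each inner array stands in for the <td> texts of one <tr>.
rows = [
  (1..12).map { |n| "r1c#{n}" },
  (1..12).map { |n| "r2c#{n}" },
  (1..12).map { |n| "r3c#{n}" }
]

last_row   = rows.last   # no need to know how many rows there are
tenth_cell = last_row[9] # 0-based index 9 is the 10th cell
puts tenth_cell
```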

how to click a link in a table based on the text in a row

Using page-object and watir-webdriver how can I click a link in a table, based on the row text as below:
The table contains 3 rows which have names in the first column, and a corresponding Details link in columns to the right:
DASHBOARD .... Details
EXAMPLE .... Details
and so on.
<div class="basicGridHeader">
<table class="basicGridTable">ALL THE DETAILS:
....soon
</table>
</div>
<div class="basicGridWrapper">
<table class="basicGridTable">
<tbody id="bicFac9" class="ide043">
<tr id="id056">
<td class="bicRowFac10">
<span>
<label class="bicDeco5">
<b>DASHBOARD:</b> ---> Based on this text
</label>
</span>
</td>
<td class="bicRowFac11">
....some element
</td>
<td class="bicRowFac12">
<span>
<a class="bicFacDet5">Details</a> ---> I should able click this link
</span>
</td>
</tr>
</tbody>
</table>
</div>
You could locate a cell that contains the specified text, go to the parent row and then find the details link in that row.
Assuming that there might be other detail links you would want to click, I would define a view_details method that accepts the text of the row you want to locate:
class MyPage
  include PageObject
  table(:grid){ div_element(:class => 'basicGridWrapper')
                  .table_element(:class => 'basicGridTable') }
  def view_details(label)
    grid_element.cell_element(:text => /#{label}/)
                .parent
                .link_element(:text => 'Details')
                .click
  end
end
You can then click the link with:
page.view_details('DASHBOARD')
Table elements include the Enumerable module, and I find it very useful in cases like these. http://ruby-doc.org/core-2.0.0/Enumerable.html. You could use the find method to locate and return the row that matches the criteria you are looking for. For example:
class MyPage
  include PageObject
  table(:grid_table, :class => 'basicGridTable')
  def click_link_by_row_text(text_value)
    matched_row = locate_row_by_text(text_value)
    matched_row.link_element.click
    # if you want to make sure you click on the link under the 3rd column you can also do this...
    # matched_row[2].link_element.click
  end
  def locate_row_by_text(text_value)
    # find the row that matches the text you are looking for
    matched_row = grid_table_element.find { |row| row.text.include? text_value }
    fail "Could not locate the row with value #{text_value}" if matched_row.nil?
    matched_row
  end
end
Here, locate_row_by_text looks for the row that includes the text you are looking for and raises an exception if it doesn't find it. Once you have the row, you can drill down to the link and click on it, as shown in the click_link_by_row_text method.
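The same find-or-fail pattern can be tried without a browser. This is only a sketch: the Struct below is a hypothetical stand-in for a table row, not a page-object class, and the row texts and link names are made up:

```ruby
# Hypothetical stand-in for a table row: its visible text and the link it holds.
row_class = Struct.new(:text, :link)

rows = [
  row_class.new('DASHBOARD: ... Details', 'dashboard-details'),
  row_class.new('EXAMPLE: ... Details', 'example-details')
]

# Enumerable#find returns the first matching element, or nil if none matches.
def locate_row_by_text(rows, text_value)
  matched_row = rows.find { |row| row.text.include?(text_value) }
  raise "Could not locate the row with value #{text_value}" if matched_row.nil?
  matched_row
end

puts locate_row_by_text(rows, 'EXAMPLE').link
```

Raising as soon as the lookup fails keeps the error message close to the cause, rather than surfacing later as a NoMethodError on nil.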
Just for posterity, I would like to give an updated answer. It is now possible to traverse through a table using table_element[row_index][column_index].
A little bit more verbose:
row_index could also be the text in a row to be matched - in your case - table_element['DASHBOARD']
Then find the corresponding cell/td element using either the index (zero-based) or the header of that column.
table_element['DASHBOARD'][2] - selects the third cell in the selected row.
Since you do not have a header row (<th> element), you can filter the cell element using the link's class attribute. Something like this:
table_element['DASHBOARD'].link_element(:class => 'bicFacDet5').click
So the code would look something like this:
class MyPage
  include PageObject
  def click_link_by_row_text(text_value)
    table_element[text_value].link_element(:class => 'bicFacDet5').click
  end
end
Let me know if you need more explanation. Happy to help :)

get node text() with or without anchor tag

I can't figure out how to get a table cell's text() whether or not an anchor tag is parent to the text.
WITH:
<td class="c divComms" title="Komentarz|">
<a id="List1_Dividends_ctl01_HyperLink1" target="_blank" href="http://www.attrader.pl/pl/akcje/DRUKPAK/komunikat/EBI/none,20130104_090845_0000041461">uchwalona</a>
<div class="stcm">2013-01-29</div></td>
WITHOUT:
<td class="c divComms" title="Komentarz|Celem...">
proponowana
<div class="stcm">2012-10-05</div>
</td>
Composing elements of a hash, I would expect
details = rows.collect do |row|
  detail = {}
  [
    [:paystatus, 'td[7]//text()[not(ancestor::div)]'],
    [:paydate, 'td[7]/div/text()'], # the 2013-01-29 or 2012-10-05 above
  ].each do |name, xpath|
    detail[name] = row.at_xpath(xpath).to_s.strip
  end
  detail
end
to catch either uchwalona or proponowana (notice without the date in the trailing div), but as it stands, it ignores the a tag text, unless I do td[7]/a/text(), in which case only the anchor's text "uchwalona" is read.
Using the union operator | should work:
[:paystatus, '(td[7]|td[7]/a)/text()']
(I think you won't need the [not(ancestor::div)] part if you don't use a double-slash)
The problem was resolved when I used row.xpath instead of row.at_xpath. Presumably this is because at_xpath returns only the first matching node, which made the union operator | appear ineffective.
So changed
detail[name] = row.at_xpath(xpath).to_s.strip
to:
detail[name] = row.xpath(xpath).to_s.strip
This meant I also had to tighten a few XPath expressions in my other |name, xpath| pairs so that they no longer over-include, which had gone unnoticed before.

Get last word inside table cell?

I want to scrape data from a table with Ruby and Nokogiri.
There are a lot of <td> elements, but I only need the country which is just text after a <br> element. The problem is, the <td> elements differ. Sometimes there is more than just the country.
For example:
<td>Title1<br>USA</td>
<td>Title2<br>Michael Powell<br>UK</td>
<td>Title3<br>Leopold Lindtberg<br>Ralph Meeker<br>Switzerland</td>
I want to address the element before the closing </td> tag because the country is always the last element.
How can I do that?
I'd use this:
require 'awesome_print'
require 'nokogiri'
html = '
<td>Title1<br>USA</td>
<td>Title2<br>Michael Powell<br>UK</td>
<td>Title3<br>Leopold Lindtberg<br>Ralph Meeker<br>Switzerland</td>
'
doc = Nokogiri::HTML(html)
ap doc.search('td').map{ |td| td.search('text()').last.text }
[
[0] "USA",
[1] "UK",
[2] "Switzerland"
]
The problem is that the HTML you are actually parsing won't consist of bare rows of <td> tags. They'll be nested inside <tr> tags, and maybe even different <table> tags, so you'll have to locate the ones you want to parse. Because your HTML sample doesn't show the true structure of the document, I can't help you more.
There are a bunch of different solutions. Another solution, using only the standard library, is to substring out the things you don't want.
node_string = <<-STRING
<td>Title1<br>USA</td>
<td>Title2<br>Michael Powell<br>UK</td>
<td>Title3<br>Leopold Lindtberg<br>Ralph Meeker<br>Switzerland</td>
STRING
node_string.split("<td>").collect do |str|
  last_str = str.split("<br>").last
  # Strip the closing tag and newline; a character class here would also eat the letters t and d.
  last_str.gsub(/\n|<\/td>/, '') unless last_str.nil?
end.compact
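A variation on the same string-only idea, sketched with String#scan. It assumes the cells look exactly like the sample (attribute-free <td> and <br> tags); for real pages a parser such as Nokogiri is the safer choice:

```ruby
html = <<~HTML
  <td>Title1<br>USA</td>
  <td>Title2<br>Michael Powell<br>UK</td>
  <td>Title3<br>Leopold Lindtberg<br>Ralph Meeker<br>Switzerland</td>
HTML

# Capture each cell body, then keep whatever follows the last <br>.
countries = html.scan(%r{<td>(.*?)</td>}m).map do |(cell)|
  cell.split('<br>').last.strip
end

p countries
```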

How can I use Capybara to check that the correct items are listed?

I'm trying to use Capybara to test that a list contains the correct items. For example:
<table id="rodents">
<tr><td class="rodent_id">1</td><td class="rodent_name">Hamster</td></tr>
<tr><td class="rodent_id">2</td><td class="rodent_name">Gerbil</td></tr>
</table>
This list should contain ids 1 and 2, but should not include 3.
What I'd like is something like:
ids = ? # get the contents of each row's first cell
ids.should include(1)
ids.should include(2)
ids.should_not include(3)
How might I do something like that?
I'm answering with a couple of unsatisfactory solutions I've found, but I'd love to see a better one.
Here is a slightly simplified expression:
rodent_ids = page.all('table#rodents td.rodent_id').map(&:text)
From there, you can do your comparisons.
rodent_ids.should include('1')
rodent_ids.should include('2')
rodent_ids.should_not include('3')
Looking for specific rows and ids
A bad solution:
within('table#rodents tr:nth-child(1) td:nth-child(1)') do
  page.should have_content @rodent1.id
end
within('table#rodents tr:nth-child(2) td:nth-child(1)') do
  page.should have_content @rodent2.id
end
page.should_not have_selector('table#rodents tr:nth-child(3)')
This is verbose and ugly, and doesn't really say that id 3 shouldn't be in the table.
Gathering the ids into an array
This is what I was looking for:
rodent_ids = page.all('table#rodents td:nth-child(1)').map{|td| td.text}
From there, I can do:
rodent_ids.should include('1')
rodent_ids.should include('2')
rodent_ids.should_not include('3')
Or just:
rodent_ids.should eq(%w[1 2])
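Because the cell texts come back as strings, the comparison values have to be strings too, or the ids have to be converted first. A sketch with a plain array standing in for the result of page.all(...).map(&:text), so no Capybara session is needed:

```ruby
# Stand-in for page.all('table#rodents td.rodent_id').map(&:text)
rodent_ids = %w[1 2]

puts rodent_ids.include?(1)   # the string "1" is not equal to the integer 1
puts rodent_ids.include?('1')

# Alternatively, convert once and compare against integers.
numeric_ids = rodent_ids.map(&:to_i)
missing     = [1, 2] - numeric_ids # expected ids that are absent
unexpected  = numeric_ids - [1, 2] # ids that should not be listed
```

Reporting the missing and unexpected arrays in a failure message says more than a bare "expected false to be true".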
Using has_table?
A bad solution:
page.has_table?('rodents', :rows => [
  ['1', 'Hamster'],
  ['2', 'Gerbil']
]).should be_true
This is very clear to read, but:
It's brittle. If the table structure or text changes at all, it fails.
If it fails, it just says it expected false to be true; I don't know an easy way to compare what the table really looks like with what's expected, other than print page.html
The has_table? method might get removed at some point
