I am trying to scan the rows of an HTML table, match rows by a partial href XPath, and then run further tests against the other column values of each matching row.
<div id = "blah">
<table>
<tr>
<td>link</td>
<td>29 33 485</td>
<td>45.2934,00 EUR</td>
</tr>
<tr>
<td>link</td>
<td>22 93 485</td>
<td>38.336.934,123 EUR</td>
</tr>
<tr>
<td>link</td>
<td>394 27 3844</td>
<td>3.485,2839 EUR</td>
</tr>
</table>
</div>
In a cucumber-jvm step definition I did this easily, as shown below (although I am more comfortable using Ruby):
@Given("^if there are...$")
public void if_there_are...() throws Throwable {
...
...
baseTable = driver.findElement(By.id("blah"));
tblRows = baseTable.findElements(By.tagName("tr"));
for(WebElement row : tblRows) {
if (row.findElements(By.xpath(".//a[contains(@href,'key=HONDA')]")).size() > 0) {
List<WebElement> col = row.findElements(By.tagName("td"));
tblData dummyThing = new tblData();
dummyThing.col1 = col.get(0).getText();
dummyThing.col2 = col.get(1).getText();
dummyThing.col3 = col.get(2).getText();
dummyThing.col4 = col.get(3).getText();
dummyThings.add(dummyThing);
}
}
In Capybara, however, I am clueless:
page.find('#blah').all('tr').each { |row|
# if row matches xpath then grab that complete row
# so that other column values can be verified
# I am clueless from here
row.find('td').each do { |c|
}
page.find('#blah').all('tr').find(:xpath, ".//a[contains(@href,'key=HONDA')]").each { |r|
#we got the row that matches xpath, let us do something
}
}
I think you are looking to do:
page.all('#blah tr').each do |tr|
next unless tr.has_selector?('a[href*="HONDA"]')
# Do stuff with trs that meet the href requirement
puts tr.text
end
#=> link 29 33 485 45.2934,00 EUR
#=> link 22 93 485 38.336.934,123 EUR
This basically says to:
Find all trs in the element with id 'blah'
Iterate through each of the trs
If the tr does not have a link that has a href containing HONDA, ignore it
Otherwise, output the text of the row (that matches the criteria). You could do whatever you need with the tr here.
You could also use xpath to collapse the above into a single statement. However, I do not think it is as readable:
page.all(:xpath, '//div[@id="blah"]//tr[.//a[contains(@href, "HONDA")]]').each do |tr|
# Do stuff with trs that meet the href requirement
puts tr.text
end
#=> link 29 33 485 45.2934,00 EUR
#=> link 22 93 485 38.336.934,123 EUR
Here is an example of how to inspect each matching row's link url and column values:
page.all('#blah tr').each do |tr|
next unless tr.has_selector?('a[href*="HONDA"]')
# Do stuff with trs that meet the href requirement
href = tr.find('a')['href']
column_value_1 = tr.all('td')[1].text
column_value_2 = tr.all('td')[2].text
puts href, column_value_1, column_value_2
end
#=> file:///C:/Scripts/Misc/Programming/Capybara/afile?key=HONDA
#=> 29 33 485
#=> 45.2934,00 EUR
#=> file:///C:/Scripts/Misc/Programming/Capybara/afile?key=HONDA
#=> 22 93 485
#=> 38.336.934,123 EUR
If you need the table row, you could probably use something like the ancestor method:
anchors = page.all('#blah a[href*="HONDA"]')
trs = anchors.map { |anchor| anchor.ancestor('tr') }
Related
I have an input text file "input.txt" that looks like this:
Country Code ID QTY
FR B000X2D 75 130
FR B000X2E 75 150
How do I extract the first, second and the third string from each line?
This code reads each whole line into a single element of the array:
f = File.open("input.txt", "r")
line_array = []
f.each_line { |line| line_array << line }
f.close
puts line_array[1]
Which outputs:
FR B000X2D 75 130
Furthermore, how can I split one line into more lines based on a quantity number,
max(quantity) = 50 per line
so that the output is:
FR B000X2D 75 50
FR B000X2D 75 50
FR B000X2D 75 30
If the file is space-delimited, it should be pretty easy to split things up:
File.readlines('input.txt').map do |line|
country, code, id, qty = line.chomp.split(/\s+/)
[ country, code, id.to_i, qty.to_i ]
end
You can also easily reject any rows you don't want, or select those you do, plus this helps with stripping off headers:
File.readlines('input.txt').reject do |line|
line.match(/\ACountry/i)
end.map do |line|
country, code, id, qty = line.chomp.split(/\s+/)
[ country, code, id.to_i, qty.to_i ]
end.select do |country, code, id, qty|
qty <= 50
end
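The snippets above parse and filter the rows, but they do not cover the second part of the question: splitting a row so that no line carries more than 50 units. A minimal sketch of one way to do that, assuming the parsed row layout [country, code, id, qty] from above. split_by_max is a hypothetical helper name, not part of any library:

```ruby
# Hypothetical helper: split one parsed row into several rows,
# each carrying at most max_qty units.
def split_by_max(row, max_qty = 50)
  country, code, id, qty = row
  # divmod gives the number of full rows and the leftover quantity
  full, rest = qty.divmod(max_qty)
  rows = Array.new(full) { [country, code, id, max_qty] }
  rows << [country, code, id, rest] if rest > 0
  rows
end

rows = split_by_max(["FR", "B000X2D", 75, 130])
rows.each { |r| puts r.join(" ") }
# FR B000X2D 75 50
# FR B000X2D 75 50
# FR B000X2D 75 30
```

Applying it to every parsed row with flat_map would yield one flat list of output rows.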
Use the CSV class if these are tab-separated entries. CSV stands for "comma separated values", but you can provide your own column separator:
require 'csv'
CSV.foreach("fname", :col_sep => "\t") do |row|
# use row here...
end
See https://ruby-doc.org/stdlib-2.0.0/libdoc/csv/rdoc/CSV.html
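For tab-separated data, the option that sets the field separator is col_sep (row_sep controls the line terminator). A small sketch, assuming the rows from the question joined with tabs and held in a string rather than a file; the headers and converters options are additions for illustration:

```ruby
require 'csv'

# Assumed sample data: the rows from the question, tab-separated.
data = "Country\tCode\tID\tQTY\nFR\tB000X2D\t75\t130\nFR\tB000X2E\t75\t150\n"

# headers: true treats the first line as column names;
# converters: :numeric turns "75" and "130" into integers.
table = CSV.parse(data, col_sep: "\t", headers: true, converters: :numeric)

table.each do |row|
  puts "#{row['Code']} #{row['QTY']}"
end
```

With headers enabled, the header line no longer needs to be rejected by hand, and each field can be read by column name.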
I have a table with the following structure:
<table class="table_class">
<tr>
<td>Label A</td>
<td>Value A</td>
<td>Label B</td>
<td><div>Value BChange</div></td>
</tr>
<tr>
<td>Label C</td>
<td><div>Value C</div></td>
<td>Label D</td>
<td><div><span><span><img src="image/source.jpg"><img src="another/image.gif"></span>Value D</span> Change</div></td>
</tr>
</table>
I would like to get the values ("Value A", "Value B", ...), but the only unique identifiers for the table cells containing those values are the table cells to their left ("Label A", "Label B", ...).
Any idea how to handle this properly within a PageObject?
Thanks in advance,
Christian
You could use an XPath with the following-sibling axis to find the values of adjacent cells.
For example, the following page object has a method that will find the label cell based on its text. From there, navigate to the next td element, which should be the associated value.
class MyPage
include PageObject
def value_of(label)
# Find the table
table = table_element(class: 'table_class')
# Find the cell containing the desired label
label_cell = table.cell_element(text: label)
# Get the next cell, which will be the value
value_cell = label_cell.cell_element(xpath: './following-sibling::td[1]')
value_cell.text
end
end
page = MyPage.new(browser)
p page.value_of('Label A')
#=> "Value A"
p page.value_of('Label B')
#=> "Value BChange"
Depending on your goals, you could also refactor this to use the accessor methods. This would allow you to have the methods for returning the value cell, its text, checking its existence, etc:
class MyPage
include PageObject
cell(:value_a) { value_of('Label A') }
cell(:value_b) { value_of('Label B') }
def value_of(label)
table = table_element(class: 'table_class')
label_cell = table.cell_element(text: label)
value_cell = label_cell.cell_element(xpath: './following-sibling::td[1]')
value_cell
end
end
page = MyPage.new(browser)
p page.value_a
#=> "Value A"
p page.value_a?
#=> true
Given an html file:
<div>
<div class="NormalMid">
<span class="style-span">
"Data 1:"
1
2
</span>
</div>
...more divs
<div class="NormalMid">
<span class="style-span">
"Data 20:"
20
21
22
23
</span>
</div>
...more divs
</div>
Using these SO posts as reference:
How do I integrate these two conditions block codes to mine in Ruby?
and
How to understand this Arrays and loops in Ruby?
My code:
require 'nokogiri'
require 'pp'
require 'open-uri'
data_file = 'site.htm'
file = File.open(data_file, 'r')
html = open(file)
page = Nokogiri::HTML(html)
page.encoding = 'utf-8'
rows = page.xpath('//div[@class="NormalMid"]')
details = rows.collect do |row|
detail = {}
[
[row.children.first.element_children,row.children.first.element_children],
].each do |part, link|
data = row.children[0].children[0].to_s.strip
links = link.collect {|item| item.at_xpath('@href').to_s.strip}
detail[data.to_sym] = links
end
detail
end
details.reject! {|d| d.empty?}
pp details
The output:
[{:"Data 1:"=>
["http://www.site.com/data/1",
"http://www.site.com/data/2"]},
...
{:"Data 20 :"=>
["http://www.site.com/data/20",
"http://www.site.com/data/21",
"http://www.site.com/data/22",
"http://www.site.com/data/20",]},
...
}]
Everything works, and this is exactly what I wanted.
BUT if you change these lines of code:
detail = {}
[
[row.children.first.element_children,row.children.first.element_children],
].each do |part, link|
to:
detail = {}
[
[row.children.first.element_children],
].each do |link|
I get the output of
[{:"Data 1:"=>
["http://www.site.com/data/1"]},
...
{:"Data 20 :"=>
["http://www.site.com/data/20"]},
...
}]
Only the first anchor href is stored in the array.
I just need some clarification on why it behaves that way. Because the block argument part is never used, I figured I didn't need it. But my program doesn't work correctly if I delete it together with the corresponding row.children.first.element_children.
What is going on in the [[obj,obj],].each do block? I just started Ruby a week ago and I'm still getting used to the syntax, so any help will be appreciated. Thank you :D
EDIT
rows[0].children.first.element_children[0] will have the output
#<Nokogiri::XML::Element:0xcea69c name="a" attributes=[#<Nokogiri::XML::Attr:0xcea648
name="href" value="http://www.site.com/data/1">] children=[#<Nokogiri::XML::Text:0xcea1a4
"1">]>
puts rows[0].children.first.element_children[0]
1
You made your code overly complicated. Looking at it, you seem to be trying to get something like the following:
require 'nokogiri'
doc = Nokogiri::HTML::Document.parse <<-eotl
<div>
<div class="NormalMid">
<span class="style-span">
"Data 1:"
1
2
</span>
</div>
<div class="NormalMid">
<span class="style-span">
"Data 20:"
20
21
22
23
</span>
</div>
</div>
eotl
rows = doc.xpath("//div[@class='NormalMid']/span[@class='style-span']")
val = rows.map do |row|
[row.at_xpath("./text()").to_s.tr('"','').strip,row.xpath(".//@href").map(&:to_s)]
end
Hash[val]
# => {"Data 1:"=>["http://site.com/data/1", "http://site.com/data/2"],
# "Data 20:"=>
# ["http://site.com/data/20",
# "http://site.com/data/21",
# "http://site.com/data/22",
# "http://site.com/data/23"]}
What is going on in the [[obj,obj],].each do block?
Look at the two examples below:
[[1],[4,5]].each do |a|
p a
end
# >> [1]
# >> [4, 5]
[[1,2],[4,5]].each do |a,b|
p a, b
end
# >> 1
# >> 2
# >> 4
# >> 5
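Applied to the question, the difference comes down to how a block with one parameter and a block with two parameters receive each inner array. A minimal sketch using a plain array in place of the Nokogiri node set (the names hrefs, one_param and two_params are illustrative only):

```ruby
hrefs = ["data/1", "data/2"]   # stands in for the node set of <a> elements

# One block parameter: the inner array arrives as-is, still wrapped.
one_param = nil
[[hrefs]].each { |link| one_param = link }
p one_param        # => [["data/1", "data/2"]]
p one_param.size   # => 1

# Two block parameters: the inner array is destructured.
two_params = nil
[[hrefs, hrefs]].each { |part, link| two_params = link }
p two_params       # => ["data/1", "data/2"]
p two_params.size  # => 2
```

In the single-parameter version, link.collect therefore runs over exactly one item (the whole node set), and at_xpath on a node set returns only its first match, which would explain why only the first href is stored. Writing the parameter as |(link)|, or calling collect on row.children.first.element_children directly, should avoid the unused part argument.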
HTML Code:
<div id="empid" title="Please first select a list to filter!"><input value="5418630" name="candidateprsonIds" type="checkbox">foo <input value="6360899" name="candidateprsonIds" type="checkbox"> bar gui<input value="9556609" name="candidateprsonIds" type="checkbox"> bab </div>
Now I would like to get the following using selenium-webdriver:
[[5418630,foo],[6360899,bar gui],[9556609,bab]]
Can it be done?
I tried the below code:
driver.find_elements(:id,"filtersetedit_fieldNames").each do |x|
puts x.text
end
But it gives me the data as the single string "foo bar gui bab" on my console, so I couldn't figure out how to build the expected structure shown above.
Any help in this regard?
The only way I know to get the text nodes like that would be to use the execute_script method.
The following script would give you the hash of option values and their following text.
#The div containing the checkboxes
checkbox_div = driver.find_element(:id => 'empid')
#Get all of the option values
option_values = checkbox_div.find_elements(:css => 'input').collect{ |x| x['value'] }
p option_values
#=> ["5418630", "6360899", "9556609"]
#Get all of the text nodes (by using javascript)
script = <<-SCRIPT
var text_nodes = [];
for(var i = 0; i < arguments[0].childNodes.length; i++) {
var child = arguments[0].childNodes[i];
// nodeType 3 is a text node
if(child.nodeType == 3) {
text_nodes.push(child.nodeValue);
}
}
return text_nodes;
SCRIPT
option_text = driver.execute_script(script, checkbox_div)
#Tidy up the text nodes to get rid of blanks and extra white space
option_text.collect!(&:strip).delete_if(&:empty?)
p option_text
#=> ["foo", "bar gui", "bab"]
#Combine the two arrays to create a hash (with key being the option value)
option_hash = Hash[*option_values.zip(option_text).flatten]
p option_hash
#=> {"5418630"=>"foo", "6360899"=>"bar gui", "9556609"=>"bab"}
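If the array-of-pairs shape from the question is what is needed, zip already produces it, and on Ruby 2.1 or later Array#to_h builds the hash without Hash[...] and flatten. A short sketch, using the values printed above as literal data instead of a live browser:

```ruby
option_values = ["5418630", "6360899", "9556609"]
option_text   = ["foo", "bar gui", "bab"]

# zip pairs each value with the text at the same index
pairs = option_values.zip(option_text)
p pairs
#=> [["5418630", "foo"], ["6360899", "bar gui"], ["9556609", "bab"]]

# to_h turns the array of [key, value] pairs into a hash
option_hash = pairs.to_h
p option_hash
```

This relies on both arrays having the same length and order, which holds only if every checkbox is followed by exactly one non-blank text node.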
I have an html like this:
...
<table>
<tbody>
...
<tr>
<th> head </th>
<td> td1 text</td>
<td> td2 text</td>
...
</tr>
</tbody>
<tfoot>
</tfoot>
</table>
...
I'm using Nokogiri with Ruby. I want to traverse each row and collect the text of the th and its corresponding td elements into a hash.
require "nokogiri"
#Parses your HTML input
html_data = "...stripped HTML markup code..."
html_doc = Nokogiri::HTML html_data
#Iterates over each row in your table
#Note that you may need to clarify the CSS selector below
result = html_doc.css("table tr").inject({}) do |all, row|
#Modify if you need to collect only the first td, for example
all[row.css("th").text] = row.css("td").text
#Return the accumulator so the next iteration receives the hash
all
end
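With inject, the block's return value becomes the accumulator for the next iteration, so the hash itself has to be the last expression in the block. each_with_object sidesteps that requirement. A sketch with plain arrays standing in for the parsed rows (the sample data is made up):

```ruby
# Made-up stand-in for the [th text, td texts] pairs taken from each row.
rows = [["head", ["td1 text", "td2 text"]], ["other", ["x"]]]

with_inject = rows.inject({}) do |all, (th, tds)|
  all[th] = tds
  all                 # return the accumulator, not the assigned value
end

with_object = rows.each_with_object({}) do |(th, tds), all|
  all[th] = tds       # the block's return value is ignored here
end

p with_inject == with_object
```

Both forms build the same hash; each_with_object is simply harder to get wrong when the block ends in an assignment.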
I didn't run this code, so I'm not absolutely sure but the overall idea should be right:
html_doc = Nokogiri::HTML("<html> ... </html>")
result = []
html_doc.xpath("//tr").each do |tr|
hash = {}
tr.children.each do |node|
hash[node.node_name] = node.content
end
result << hash
end
puts result.inspect
See the docs for more info: http://nokogiri.org/Nokogiri/XML/Node.html