pageobject - when_visible for all elements - ruby

I am using a combination of cucumber and pageobject to test my web application. Sometimes, the script tries to click an element even before the page that contains the element starts loading. (I confirmed this by capturing the screenshots of failing scenarios)
This inconsistency is not widespread; it happens repeatedly, but only for a few elements. If, instead of accessing those elements directly, I do example_element.when_visible.click, the test suite always passes.
As of now, I click a link using link_name (generated by the pageobject module on calling link(:name, identifier: {index: 0}, &block)).
I would like to leave the above-mentioned snippet unedited, but have it behave as if I had called link_name_element.when_visible.click. The reason is that the test suite is pretty large and it would be tedious to change all the occurrences. I also believe that this functionality is already present and I somehow just don't see it anywhere. Can anybody help me out?

This solution seems quite hacky and may not consider some edge cases. However, I will share it since there are no other answers yet.
You can add the following monkey patch assuming that you are using watir-webdriver. This would be added after you require page-object.
require 'watir-webdriver'
require 'page-object'

module PageObject
  module Platforms
    module WatirWebDriver
      class PageObject
        # Locates an element, appending .when_present unless :wait => false was given
        def find_watir_element(the_call, type, identifier, tag_name=nil)
          identifier, frame_identifiers, wait = parse_identifiers(identifier, type, tag_name)
          the_call, identifier = move_element_to_css_selector(the_call, identifier)
          if wait
            element = @browser.instance_eval "#{nested_frames(frame_identifiers)}#{the_call}.when_present"
          else
            element = @browser.instance_eval "#{nested_frames(frame_identifiers)}#{the_call}"
          end
          switch_to_default_content(frame_identifiers)
          type.new(element, :platform => :watir_webdriver)
        end

        # Runs an action on an element, inserting .when_present before the final method call
        def process_watir_call(the_call, type, identifier, value=nil, tag_name=nil)
          identifier, frame_identifiers, wait = parse_identifiers(identifier, type, tag_name)
          the_call, identifier = move_element_to_css_selector(the_call, identifier)
          if wait
            modified_call = the_call.dup.insert(the_call.rindex('.'), '.when_present')
            value = @browser.instance_eval "#{nested_frames(frame_identifiers)}#{modified_call}"
          else
            value = @browser.instance_eval "#{nested_frames(frame_identifiers)}#{the_call}"
          end
          switch_to_default_content(frame_identifiers)
          value
        end

        # Strips the custom :wait key from the locator; waiting is the default
        def parse_identifiers(identifier, element, tag_name=nil)
          wait = identifier.has_key?(:wait) ? false : true
          identifier.delete(:wait)
          frame_identifiers = identifier.delete(:frame)
          identifier = add_tagname_if_needed identifier, tag_name if tag_name
          identifier = element.watir_identifier_for identifier
          return identifier, frame_identifiers, wait
        end
      end
    end
  end
end
Basically, the intent of this patch is that the Watir when_present method is always called. For example, your page object call will get translated to Watir as browser.link.when_present.click. In theory, it should get called for any method called on a page object element.
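To make the rewriting step concrete, here is a minimal plain-Ruby sketch of the string manipulation the patch relies on. The helper name add_when_present and the sample call strings are made up for illustration; only the dup/insert/rindex idiom is taken from the patch.

```ruby
# Hypothetical helper (not part of page-object): splice '.when_present'
# in front of the last method of a call string, as process_watir_call does.
def add_when_present(the_call)
  the_call.dup.insert(the_call.rindex('.'), '.when_present')
end

puts add_when_present('link(identifier).click')
puts add_when_present('text_field(identifier).set(value)')
```

Because rindex finds the last dot, the wait lands directly before the final action (click, set, and so on) rather than before the element lookup.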
Unfortunately, there is a catch. There are some situations where you probably do not want to wait for the element to become present. For example, when doing page.link_element.when_not_visible, you would not want to wait for the element to appear before checking that it does not appear. In these cases, you can force the standard behaviour of not waiting by including :wait => false in the element locator:
page.link_element(:wait => false).when_not_visible
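The opt-out can be illustrated in isolation. The following is a small sketch, not the patch itself: split_wait_option is a hypothetical helper, and unlike parse_identifiers above it copies the locator hash and treats only an explicit false as "do not wait".

```ruby
# Hypothetical helper: separate the custom :wait flag from the real locator.
# Waiting is the default; only :wait => false turns it off.
def split_wait_option(identifier)
  locator = identifier.dup               # leave the caller's hash untouched
  wait = locator.delete(:wait) != false  # nil (key absent) or true => wait
  [locator, wait]
end

p split_wait_option(:class => 'menu')
p split_wait_option(:class => 'menu', :wait => false)
```

Copying the hash first is a design choice for the sketch: it avoids surprising a caller that reuses the same locator hash for a later lookup.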

Related

Looking for a cleaner way to scrape from website by avoiding repeating

Hi, I am doing a bit of refactoring on a small CLI web scraping project I wrote in Ruby, and I was wondering if there is a cleaner way to write a particular section without repeating the code too much.
Basically with the code below, I pulled data from a website but I had to do this per page. You will notice that both methods are only different by their name and the source.
def self.scrape_first_page
html = open("https://www.texasblackpages.com/united-states/san-antonio")
doc = Nokogiri::HTML(html)
doc.css('div.grid_element').each do |business|
biz = Business.new
biz.name = business.css('a b').text
biz.type = business.css('span.hidden-xs').text
biz.number = business.css('span.sm-block.lmargin.sm-nomargin').text.gsub("\r\n","").strip
end
end
def self.scrape_second_page
html = open('https://www.texasblackpages.com/united-states/san-antonio?page=2')
doc = Nokogiri::HTML(html)
doc.css('div.grid_element').each do |business|
biz = Business.new
biz.name = business.css('a b').text
biz.type = business.css('span.hidden-xs').text
biz.number = business.css('span.sm-block.lmargin.sm-nomargin').text.gsub("\r\n","").strip
end
end
Is there a way for me to streamline this process with just one method pulling from one source, but with the ability to access different pages within the same site, or is this pretty much the best and only way? The owners of the website do not have a public API for me to pull from, in case anyone is wondering.
Remember that in programming you want to steer towards code that follows the Zero, One or Infinity Rule: avoid the dreaded two. In other words, write methods that take no arguments, a fixed argument (one), or an array of unspecified size (infinity).
So the first step is to clean up the scraping function to make it as generic as possible:
require 'nokogiri'
require 'open-uri'

def scrape(page)
  # URI.open comes from open-uri; a bare open(url) no longer fetches URLs on Ruby 3.0+
  doc = Nokogiri::HTML(URI.open(page))
  # Use map here to return an array of Business objects
  doc.css('div.grid_element').map do |business|
    Business.new.tap do |biz|
      # Use tap to modify this object before returning it
      biz.name = business.css('a b').text
      biz.type = business.css('span.hidden-xs').text
      biz.number = business.css('span.sm-block.lmargin.sm-nomargin').text.gsub("\r\n","").strip
    end
  end
end
Note that apart from the extraction code, there's nothing specific about this. Takes a URL, returns Business objects in an Array.
In order to generate pages 1..N, consider this:
def pages(base_url, start: 1)
page = start
Enumerator.new do |y|
loop do
y << base_url % page
page += 1
end
end
end
Now that's an infinite series, but you can always cap it to whatever you want with take(n) or by instead looping until you get an empty list:
# Collect all business from each of the pages...
businesses = pages('https://www.texasblackpages.com/united-states/san-antonio?page=%d').lazy.map do |page|
# ...by scraping the page...
scrape(page)
end.take_while do |results|
# ...and iterating until there's no results, as in Array#any? is false.
results.any?
end.to_a.flatten
The .lazy part means "evaluate each part of the chain sequentially", one element at a time, as opposed to the default behaviour of evaluating each stage to completion. This is important, or else it will try to download an infinite number of pages before moving on to the next step.
The .to_a on the end forces that chain to run to completion. The .flatten squishes all the page-wise results into a single result set.
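The lazy chain can be tried without touching the network. Below is a minimal sketch in which fake_scrape, FAKE_SITE, FETCHED and the example.test URLs are all stand-ins invented for the demonstration; only the pages enumerator and the lazy/map/take_while/to_a/flatten chain mirror the code above.

```ruby
# Stand-in data: pages 1-3 have results, every later page is empty.
FAKE_SITE = {
  'https://example.test/list?page=1' => %w[Alpha Bravo],
  'https://example.test/list?page=2' => %w[Charlie],
  'https://example.test/list?page=3' => %w[Delta Echo]
}.freeze
FETCHED = [] # records which URLs were actually requested

def pages(base_url, start: 1)
  Enumerator.new do |y|
    page = start
    loop do
      y << format(base_url, page)
      page += 1
    end
  end
end

# Hypothetical replacement for scrape(page) that reads from FAKE_SITE.
def fake_scrape(url)
  FETCHED << url
  FAKE_SITE.fetch(url, [])
end

businesses = pages('https://example.test/list?page=%d')
             .lazy
             .map { |url| fake_scrape(url) }
             .take_while(&:any?)
             .to_a
             .flatten

puts businesses.inspect
puts "pages requested: #{FETCHED.size}"
```

Only four pages are requested here: the three with results plus the first empty one, which is what stops take_while.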
Of course if you want to scrape the first N pages, it's a lot easier:
pages('https://www.texasblackpages.com/.../san-antonio?page=%d').take(n).flat_map do |page|
scrape(page)
end
It's almost no code!
This was suggested by @Todd A. Jacobs
def self.scrape(url)
  html = open(url)
  doc = Nokogiri::HTML(html)
  doc.css('div.grid_element').each do |business|
    biz = Business.new
    biz.name = business.css('a b').text
    biz.type = business.css('span.hidden-xs').text
    biz.number = business.css('span.sm-block.lmargin.sm-nomargin').text.gsub("\r\n","").strip
  end
end
The downside is that, with there being no public API, I had to invoke the method as many times as I needed, since the URLs represent different pages within the website. This is fine, though, because I was able to get rid of the repeated methods.
def make_listings
Scraper.scrape("https://www.texasblackpages.com/united-states/san-antonio")
Scraper.scrape("https://www.texasblackpages.com/united-states/san-antonio?page=2")
Scraper.scrape("https://www.texasblackpages.com/united-states/san-antonio?page=3")
Scraper.scrape("https://www.texasblackpages.com/united-states/san-antonio?page=4")
end
I once had the same problem as you, and I used a loop. Usually, if a site supports pagination, the first page can also be requested with the page query parameter.
def self.scrape
page = 1
loop do
url = "https://www.texasblackpages.com/united-states/san-antonio?page=#{page}"
# do nokogiri parse
# do data scraping
page += 1
end
end
You can break out of the loop when a certain page condition is met, for example when a page returns no results.
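As a rough sketch of such a stop condition, the loop below ends as soon as a page comes back empty. The names scrape_all and fake_fetch are hypothetical, and the fetcher is a lambda standing in for the Nokogiri parsing step so the example runs offline.

```ruby
# Hypothetical driver: keep requesting pages until one comes back empty.
def scrape_all(fetch_page)
  results = []
  page = 1
  loop do
    batch = fetch_page.call(page)
    break if batch.empty? # stop condition: no listings on this page
    results.concat(batch)
    page += 1
  end
  results
end

# Stand-in for the real scraping step: two pages of data, then nothing.
fake_fetch = ->(page) { page <= 2 ? ["biz-#{page}"] : [] }

all = scrape_all(fake_fetch)
puts all.inspect
```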

How to use user input across classes in Ruby?

I’m writing an app that scrapes genius.com to show a user the top ten songs. The user can then pick a song to see the lyrics.
I’d like to know how to employ the user input collected in my cli class inside of a method in my scraper class.
Right now I have part of the scrape happening outside the scraper class, but I'd like a clean division of responsibility.
Here’s part of my code:
class CLI
  def get_user_song
    chosen_song = gets.strip.to_i
    if chosen_song > 10 || chosen_song < 1
      puts "Only the hits! Choose a number from 1-10."
    end
  end
end
I’d like to be able to do something like the below.
class Scraper
  def self.scrape_lyrics
    page = Nokogiri::HTML(open("https://genius.com/#top-songs"))
    @url = page.css('div#top-songs a').map {|link| link['href']}
    user_selection = @input_from_cli #<---this is where I'd like to use the output
                                     # of the 'gets' method above.
    @print_lyrics = @url[user_selection - 1]
    scrape_2 = Nokogiri::HTML(open(@print_lyrics))
    puts scrape_2.css(".lyrics").text
  end
end
I'm basically wondering how I can pass the chosen song variable into the Scraper class. I've tried a writing class method, but was having trouble writing it in a way that didn't break the rest of my program.
Thanks for any help!
I see two possible solutions to your problem. Which one is appropriate for this depends on your design goals. I'll try to explain with each option:
From a plain reading of your code, the user inputs the number without seeing the content of the page (through your program). In this case the simple way would be to pass in the selected number as a parameter to the scrape_lyrics method:
def self.scrape_lyrics(user_selection)
  page = Nokogiri::HTML(open("https://genius.com/#top-songs"))
  @url = page.css('div#top-songs a').map {|link| link['href']}
  @print_lyrics = @url[user_selection - 1]
  scrape_2 = Nokogiri::HTML(open(@print_lyrics))
  puts scrape_2.css(".lyrics").text
end
All sequencing happens in the CLI class, and the scraper is called with all the necessary data from the get-go.
When imagining your tool more interactively, I was thinking it could be useful to have the scraper download the current top 10 and present the list to the user to choose from. In this case the interaction is a little bit more back-and-forth.
If you still want a strict separation, you can split scrape_lyrics into scrape_top_ten and scrape_lyrics_by_number(song_number) and sequence that in the CLI class.
If you expect the interaction flow to be very dynamic it might be better to inject the interaction methods into the scraper and invert the dependency:
def self.scrape_lyrics(cli)
  page = Nokogiri::HTML(open("https://genius.com/#top-songs"))
  titles = page.css('div#top-songs h3:first-child').map {|t| t.text}
  user_selection = cli.choose(titles) # presents a choice to the user, returning the selected number
  @url = page.css('div#top-songs a').map {|link| link['href']}
  @print_lyrics = @url[user_selection - 1]
  scrape_2 = Nokogiri::HTML(open(@print_lyrics))
  puts scrape_2.css(".lyrics").text
end
See the tty-prompt gem for an example implementation of the latter approach.
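For illustration, here is one way the injected object could look without any gem. This is a sketch under assumptions: the CLI#choose signature follows the snippet above, while the input/output keyword arguments, the LyricsPicker class and the song titles are invented so the example can run against StringIO instead of a terminal or the real site.

```ruby
require 'stringio'

class CLI
  # The streams are injectable so the class can be exercised without a terminal.
  def initialize(input: $stdin, output: $stdout)
    @input = input
    @output = output
  end

  # Lists the titles and keeps asking until a valid 1-based number is entered.
  def choose(titles)
    titles.each_with_index { |title, i| @output.puts "#{i + 1}. #{title}" }
    loop do
      @output.print "Pick a song (1-#{titles.size}): "
      line = @input.gets
      raise ArgumentError, 'no more input' if line.nil?

      number = line.strip.to_i
      return number if number.between?(1, titles.size)

      @output.puts "Only the hits! Choose a number from 1-#{titles.size}."
    end
  end
end

# Hypothetical stand-in for the scraper: it only needs something that responds to choose.
class LyricsPicker
  def self.pick(titles, cli)
    titles[cli.choose(titles) - 1]
  end
end

input = StringIO.new("42\n2\n") # one invalid answer, then a valid one
output = StringIO.new
picked = LyricsPicker.pick(['Song A', 'Song B', 'Song C'], CLI.new(input: input, output: output))
puts picked
```

The scraper side never calls gets itself; it depends only on the choose method, so the prompt implementation can be swapped freely.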

Ruby PageObject Design for Similar Page Sections

I'm using the Cheezy Page Object gem (which also means I'm using Watir, which also means I'm using Selenium). I also have the watir gem explicitly loaded.
Anyway I have a site I am modeling with the UI written in angular where there is 1 page whose contents change based on dropdown selection. The page has several sections but it is visibly the same for each dropdown choice. The only difference is the xpath locators I am using to get there (there's no unique ID on the sections).
So for example I have an xpath like html/body/div[1]/div/div[1]/div/**green**/div/div[1]
and another like
html/body/div[1]/div/div[1]/div/**red**/div/div[1]
The elements on the sections strangely all have the same ID attribute and same class name. So I've been using xpath for the elements since that appears to make it a unique locator.
Problem is there are currently seven dropdown choices each with several sections like this. And they have visibly same elements and structure (from end user perspective) but when you look at html the only difference is the locator so like this for the elements:
html/body/div[1]/div/div[1]/div/green/div/div[1]/**<element>**
and another like
html/body/div[1]/div/div[1]/div/red/div/div[1]/**<element>**
In my current design I have created one page and created page sections for each section on a page. Multiply the number of page sections by the number of dropdown choices and you see it is a lot. Some of the choices do generate extra elements, but there are still common elements between all sections. I also have to duplicate all of these elements across the seven different pages because the xpath is different. Is there some way for me to pass some initializer to the PageObject page_section, like the type-a or type-b string, and then based on that choose the correct xpath for all elements?
So like if I have text field like so in like a base page object page_section:
text_field(:team, xpath: "...#{type_variable}")
Can I do something like section = SomePageObject.page_section_name(type_variable)?
EDIT: Adding Page Object code per request
class BasePO
include PageObject
#Option S1 Cards
page_section(:options_red_card, OptionRedCard, xpath: "/html/body/app-component/app-page/div[2]/div/div/div[1]/div/div/div/ngb-tabset/div/div/red/div[2]/div[2]/div/div/div/div")
page_section(:options_green_card, OptionGreenCard, xpath: "/html/body/app-component/app-page/div[2]/div/div/div[1]/div/div/div/ngb-tabset/div/div/green/div[2]/div[2]/div/div/div/div")
page_section(:options_yellow_card, OptionYellowCard, xpath: "/html/body/app-component/app-page/div[2]/div/div/div[1]/div/div/div/ngb-tabset/div/div/yellow/div[2]/div[2]/div/div/div/div")
#Detail S2 Cards
page_section(:detail_red_card, DetailRedCard, xpath: "/html/body/app-component/app-page/div[2]/div/div/div[1]/div/div/div/ngb-tabset/div/div/red/div[1]/div/div/div")
page_section(:detail_green_card, DetailGreenCard, xpath: "/html/body/app-component/app-page/div[2]/div/div/div[1]/div/div/div/ngb-tabset/div/div/green/div[1]/div/div/div")
page_section(:detail_yellow_card, DetailYellowCard, xpath: "/html/body/app-component/app-page/div[2]/div/div/div[1]/div/div/div/ngb-tabset/div/div/yellow/div[1]/div/div/div")
end
EDIT2: Adding page_section content per request. All Option Cards share these elements at a minimum. Different elements in the Detail Cards but same structure as Option Cards.
class OptionRedCard
include PageObject
def field1_limit
text_field_element(xpath: "/html/body/app-component/app-page/div[2]/div/div/div[1]/div/div/div/ngb-tabset/div/div/red-unit/div[2]/div[2]/div/div/div/red/form/div/div/div/table/tbody/tr[2]/td[2]/div/div[1]/div/currency/div/input")
end
def field1_agg
text_field_element(xpath: "/html/body/app-component/app-page/div[2]/div/div/div[1]/div/div/div/ngb-tabset/div/div/red-unit/div[2]/div[2]/div/div/div/red/form/div/div/div/table/tbody/tr[2]/td[2]/div/div[2]/div/currency/div/input")
end
def field2_limit
text_field_element(xpath: "/html/body/app-component/app-page/div[2]/div/div/div[1]/div/div/div/ngb-tabset/div/div/red-unit/div[2]/div[2]/div/div/div/red/form/div/div/div/table/tbody/tr[3]/td[2]/div/div[1]/div/currency/div/input")
end
def field2_agg
text_field_element(xpath: "/html/body/app-component/app-page/div[2]/div/div/div[1]/div/div/div/ngb-tabset/div/div/red-unit/div[2]/div[2]/div/div/div/red/form/div/div/div/table/tbody/tr[3]/td[2]/div/div[2]/div/currency/div/input")
end
def field3_limit
text_field_element(xpath: "/html/body/app-component/app-page/div[2]/div/div/div[1]/div/div/div/ngb-tabset/div/div/red-unit/div[2]/div[2]/div/div/div/red/form/div/div/div/table/tbody/tr[4]/td[2]/div/div[1]/div/currency/div/input")
end
def field3_agg
text_field_element(xpath: "/html/body/app-component/app-page/div[2]/div/div/div[1]/div/div/div/ngb-tabset/div/div/red-unit/div[2]/div[2]/div/div/div/red/form/div/div/div/table/tbody/tr[4]/td[2]/div/div[2]/div/currency/div/input")
end
def field1_agg_value
field1_agg.attribute_value('data-value')
end
def field2_agg_value
field2_agg.attribute_value('data-value')
end
def field3_agg_value
field3_agg.attribute_value('data-value')
end
end
I think the short answer to your question is no: there is no built-in support for passing a value to the page sections. However, here are some alternatives I can think of.
Option 1 - Use initialize_accessors
Usually the accessors are executed when the class body is evaluated, that is, as soon as the class is loaded. However, you could use the #initialize_accessors method to defer the execution until the initialization of the page object (or section). This would let you define your accessors in a base class that, at initialization, inserts the color type into the paths:
class BaseCard
include PageObject
def initialize_accessors
# Accessors defined with placeholder for the color type
self.class.text_field(:field1_limit, xpath: "/html/body/some/path/#{color_type}/more/path/input")
end
end
# Each card class would define its color for substitution into the accessors
class OptionRedCard < BaseCard
def color_type
'red'
end
end
class OptionGreenCard < BaseCard
def color_type
'green'
end
end
class BasePO
include PageObject
page_section(:options_red_card, OptionRedCard, xpath: '/html/body/path')
page_section(:options_green_card, OptionGreenCard, xpath: '/html/body/path')
end
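The deferral idea can be seen without the gem. The sketch below is plain Ruby that only mimics the pattern: the locators registry and the initialize hook are invented for the demonstration and are not page-object's API, and the path is the same placeholder used above.

```ruby
# Plain-Ruby imitation of deferring locator definitions until initialization.
class BaseCard
  # Per-class registry of locators, filled in lazily (hypothetical).
  def self.locators
    @locators ||= {}
  end

  def initialize
    initialize_accessors
  end

  # Runs per instance, so color_type from the concrete subclass is available.
  def initialize_accessors
    self.class.locators[:field1_limit] =
      "/html/body/some/path/#{color_type}/more/path/input"
  end
end

class OptionRedCard < BaseCard
  def color_type
    'red'
  end
end

class OptionGreenCard < BaseCard
  def color_type
    'green'
  end
end

puts OptionRedCard.locators.inspect # nothing defined before instantiation
OptionRedCard.new
OptionGreenCard.new
puts OptionRedCard.locators[:field1_limit]
puts OptionGreenCard.locators[:field1_limit]
```

Each subclass ends up with its own path because the string interpolation runs only once color_type can be resolved on an instance.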
Option 2 - Using relative paths
My suggested approach would be to use relative paths such that the color can be removed from the path of the page section. From the objects provided, you might be able to do something like:
class OptionCard
include PageObject
element(:unit) { following_sibling(tag_name: "#{root.tag_name}-unit") }
div(:field1_limit) { unit_element.tr(index: 1).text_field(index: 0) }
div(:field1_agg) { unit_element.tr(index: 1).text_field(index: 1) }
div(:field2_limit) { unit_element.tr(index: 2).text_field(index: 0) }
div(:field2_agg) { unit_element.tr(index: 2).text_field(index: 1) }
end
class BasePO
include PageObject
# Page sections only defined to the top most element of the section (the color element)
page_section(:options_red_card, OptionCard, xpath: "/html/body/app-component/app-page/div[2]/div/div/div[1]/div/div/div/ngb-tabset/div/div/red")
page_section(:options_green_card, OptionCard, xpath: "/html/body/app-component/app-page/div[2]/div/div/div[1]/div/div/div/ngb-tabset/div/div/green")
end

Define variable by elements class even not displayed on current page

I'm using Ruby and Selenium to get some data from a page. I want to define a variable with driver.find_element, but the element is not currently visible on the page.
next_element = driver.find_element(:class, 'right')
It returns Selenium::WebDriver::Error::NoSuchElementError
It works fine when element is present.
Any solutions?
Thank you!
Selenium drives the browser, and find_element searches the DOM for the element. If it cannot find one, it raises the error you are seeing. After all, an element that is not in the DOM cannot be found.
The real question is why you want to find an element that is not currently present in the DOM. You can't do anything with something that doesn't exist.
The only case I can think of is that the element appears after the initial page load because JavaScript has not finished executing yet. If that is the case, you can use a Selenium::WebDriver::Wait to keep trying to find the element for a certain amount of time.
A small example:
wait = Selenium::WebDriver::Wait.new(:timeout => 10) # seconds
begin
element = wait.until { driver.find_element(:id => "some-dynamic-element") }
ensure
driver.quit
end
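To show roughly what such a wait does, here is a plain-Ruby polling sketch. It is not Selenium's implementation: wait_until, ElementMissing and WaitTimeout are names made up for the example, and the "lookup" is a block that simply succeeds on its third attempt.

```ruby
class ElementMissing < StandardError; end # stand-in for NoSuchElementError
class WaitTimeout < StandardError; end

# Retry the block until it returns a truthy value or the timeout expires.
def wait_until(timeout: 10, interval: 0.2)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    begin
      result = yield
      return result if result
    rescue ElementMissing
      # Not there yet: fall through and try again after a short pause.
    end
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeout, "condition not met within #{timeout} seconds"
    end
    sleep interval
  end
end

attempts = 0
element = wait_until(timeout: 2, interval: 0.01) do
  attempts += 1
  raise ElementMissing, 'not rendered yet' if attempts < 3
  :right_arrow # pretend this is the located element
end
puts "found #{element.inspect} after #{attempts} attempts"
```

If the element never shows up, the helper raises instead of returning nil, so a missing element cannot be mistaken for a successful lookup.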
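Edit to include a begin/rescue example:
begin
  # `next` is a reserved word in Ruby, so the variable needs a different name
  next_element = driver.find_element(:class, 'right')
  # Code for when element is found here
rescue Selenium::WebDriver::Error::NoSuchElementError
  # Code for when element is not found here
end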

Use `send` to check the existence of an element

This code is supposed to take a string as input to check if an element of page_object is present. The script has to raise an exception in case it discovers the element, and do nothing if it doesn't.
Example Page Object:
span(:partner_flag, class: 'content-partner-flag')
The script:
def check_element_not_exist(page_object)
page_object = page_object.downcase.gsub(' ', '_')
option = send("#{page_object}")
if option.exists?
raise "#{page_object} was not found!"
end
end
In this case, I use the string partner_flag to feed the function and check the element. Watir fails in the line:
option = send("#{page_object}")
because it needs to find that element in the webpage in order to define option. Is there an alternate way of defining option, or a different way of making this non-existence check with the send functionality?
The accessor methods create a method for checking if an element exists.
For example, when you include:
span(:partner_flag, class: 'content-partner-flag')
Then the method:
partner_flag?
Is created that returns true if the element exists and false if the element does not.
You could call this method in the check_element_not_exist method:
def check_element_not_exist(page_object)
page_object = page_object.downcase.gsub(' ', '_')
exists = send("#{page_object}?")
if exists
raise "#{page_object} was found!"
end
end
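The dispatch can be tried without a browser. In this sketch FakePage is an invented stand-in for a page object whose partner_flag? method is written by hand (the real one would be generated by the span accessor), and the respond_to? guard is an addition not present in the answer above.

```ruby
# Stand-in page: partner_flag? is hand-written here instead of generated.
class FakePage
  def initialize(present_elements)
    @present_elements = present_elements
  end

  def partner_flag?
    @present_elements.include?(:partner_flag)
  end

  # Raises if the named element exists; returns true otherwise.
  def check_element_not_exist(name)
    predicate = "#{name.downcase.tr(' ', '_')}?"
    raise ArgumentError, "unknown element: #{name}" unless respond_to?(predicate)
    raise "#{name} was found!" if public_send(predicate)

    true
  end
end

puts FakePage.new([]).check_element_not_exist('Partner Flag')

begin
  FakePage.new([:partner_flag]).check_element_not_exist('Partner Flag')
rescue RuntimeError => e
  puts e.message
end
```

Checking respond_to? first turns a misspelled element name into a clear ArgumentError rather than a NoMethodError from the dynamic call.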
