How can I tab dynamically with capybara? - ruby

I am testing a web app with multiple dynamic rows, and there is nothing nearby that I can scope to or grab directly. I reach a particular field by finding an element I can identify by id and then tabbing to the text box or selector I want to manipulate.
It looks like this...
editor = page.find_by_id('grabbable')
editor.native.send_keys(:tab, :tab, "Hello World")
What I'd like to do is something like...
tab_amount = tabs(2)
editor = page.find_by_id('grabbable')
editor.native.send_keys(tab_amount, "Hello World")
...
def tabs(amount)
tab_object = :tab
while amount > 1
tab_object = tab_object + :tab
amount = amount - 1
end
return tab_amount
end
Is such a dynamic tab possible?

What about something like this?
def tabs(amount)
tab_object = Array.new(amount, :tab)
end
editor.native.send_keys(*tabs(3), "Hello World")
There is some information on the splat operator (*) here:
http://www.ruby-doc.org/core-2.0/doc/syntax/calling_methods_rdoc.html#label-Array+to+Arguments+Conversion

Here is what I ended up doing...
def autotab(amount)
tab = Array.new
amount.times do
tab << :tab
end
return tab
end
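To see what the splat changes, here is a small sketch that runs without a browser. fake_send_keys is a hypothetical stand-in for editor.native.send_keys (it is not part of Capybara or Selenium); it simply returns the arguments it received so the effect of * is visible.

```ruby
# Build an array containing `amount` copies of the :tab key symbol.
def tabs(amount)
  Array.new(amount, :tab)
end

# Hypothetical stand-in for `editor.native.send_keys`:
# it returns the list of arguments it was called with.
def fake_send_keys(*keys)
  keys
end

# With the splat, each :tab is passed as its own argument.
splatted = fake_send_keys(*tabs(3), "Hello World")

# Without the splat, the whole array arrives as one nested argument.
nested = fake_send_keys(tabs(3), "Hello World")

puts splatted.inspect
puts nested.inspect
```

The autotab helper above builds the same array as tabs, so it can be splatted in the same way.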

Related

Web scraping with Kimurai gem

I am doing some web scraping with the Kimurai Ruby gem. I have this script that works great:
require 'kimurai'
class SimpleSpider < Kimurai::Base
@name = "simple_spider"
@engine = :selenium_chrome
@start_urls = ["https://apply.workable.com/taxjar/"]
def parse(response, url:, data: {})
# Update response to current response after interaction with a browser
count = 0
# browser.click_button "Show more"
doc = browser.current_response
returned_jobs = doc.css('.careers-jobs-list-styles__jobsList--3_v12')
returned_jobs.css('li').each do |char_element|
# puts char_element
title = char_element.css('a')[0]['aria-label']
link = "https://apply.workable.com" + char_element.css('a')[0]['href']
#click on job link and get description
browser.visit(link)
job_page = browser.current_response
description = job_page.xpath('/html/body/div[1]/div/div[1]/div[2]/div[2]/div[2]').text
puts '*******'
puts title
puts link
puts description
puts count += 1
end
puts "There are #{count} jobs total"
end
end
SimpleSpider.crawl!
However, I want this to return an array of objects (jobs, in this case). I'd like to create a jobs array in the parse method, do something like jobs << [title, link, description, company] inside the returned_jobs loop, and have that array returned when I call SimpleSpider.crawl!, but that doesn't work.
Any help appreciated.
You can slightly modify your code like this:
class SimpleSpider < Kimurai::Base
@name = "simple_spider"
@engine = :selenium_chrome
@start_urls = ["https://apply.workable.com/taxjar/"]
def parse(response, url:, data: {})
# Update response to current response after interaction with a browser
count = 0
# browser.click_button "Show more"
doc = browser.current_response
returned_jobs = doc.css('.careers-jobs-list-styles__jobsList--3_v12')
jobs = []
returned_jobs.css('li').each do |char_element|
# puts char_element
title = char_element.css('a')[0]['aria-label']
link = "https://apply.workable.com" + char_element.css('a')[0]['href']
#click on job link and get description
browser.visit(link)
job_page = browser.current_response
description = job_page.xpath('/html/body/div[1]/div/div[1]/div[2]/div[2]/div[2]').text
jobs << [title, link, description]
end
puts "There are #{jobs.count} jobs total"
puts jobs
end
end
I am not sure about company, as I don't see that variable in your code. However, the idea is to initialize an array before the loop and append to it inside the loop, as shown above.
I also have a blog post here about how to use Kimurai framework from Ruby on Rails application.
It turns out there is a parse! class method that allows a value to be returned. Here is a working example:
require 'open-uri'
require 'nokogiri'
require 'kimurai'
class TaxJar < Kimurai::Base
@name = "tax_jar"
@engine = :selenium_chrome
@start_urls = ["https://apply.workable.com/taxjar/"]
def parse(response, url:, data: {})
jobs = Array.new
doc = browser.current_response
returned_jobs = doc.css('.careers-jobs-list-styles__jobsList--3_v12')
returned_jobs.css('li').each do |char_element|
title = char_element.css('a')[0]['aria-label']
link = "https://apply.workable.com" + char_element.css('a')[0]['href']
#click on job link and get description
browser.visit(link)
job_page = browser.current_response
description = job_page.xpath('/html/body/div[1]/div/div[1]/div[2]/div[2]/div[2]').text
company = 'TaxJar'
puts "title is: #{title}, link is: #{link}, \n description is: #{description}"
jobs << [title, link, description, company]
end
return jobs
end
end
jobs = TaxJar.parse!(:parse, url: "https://apply.workable.com/taxjar/")
puts jobs.inspect
If you are scraping JavaScript-heavy websites, this gem seems pretty robust compared with the others (Watir/Selenium) I have tried.
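If named fields are preferable to positional arrays, the collected rows could be wrapped in a Struct. This is a minimal sketch in plain Ruby; Job and build_jobs are hypothetical names, and the rows below are made-up placeholders rather than scraped data.

```ruby
# Hypothetical record type for one job posting.
Job = Struct.new(:title, :link, :description, :company, keyword_init: true)

# Convert [title, link, description, company] rows into Job objects.
def build_jobs(rows)
  rows.map do |title, link, description, company|
    Job.new(title: title, link: link, description: description, company: company)
  end
end

# Placeholder rows standing in for what the parse method collects.
rows = [
  ["Example title A", "https://example.com/a", "Example description A", "TaxJar"],
  ["Example title B", "https://example.com/b", "Example description B", "TaxJar"]
]

jobs = build_jobs(rows)
jobs.each { |job| puts "#{job.title} (#{job.company}): #{job.link}" }
```

Each element then answers to job.title, job.link and so on, instead of relying on array positions.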

How to write a while loop properly

I'm trying to scrape a website; however, I cannot seem to get my while loop to break once it hits a page with no more information:
def scrape_verse_items(keyword)
pg = 1
while pg < 1000
puts "page #{pg}"
url = "https://www.bible.com/search/bible?page=#{pg}&q=#{keyword}&version_id=1"
doc = Nokogiri::HTML(open(url))
items = doc.css("ul.search-result li.reference")
error = doc.css('div#noresults')
until error.any? do
if keyword != ''
item_hash = {}
items.each do |item|
title = item.css("h3").text.strip
content = item.css("p").text.strip
item_hash[title] = content
end
else
puts "Please enter a valid search"
end
if error.any?
break
end
end
pg += 1
end
item_hash
end
puts scrape_verse_items('joy')
I know this doesn't exactly answer your question, but perhaps you might consider using a different approach altogether.
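Using nested while and until loops can get a bit confusing. In your code, error is assigned once before the inner until loop and never changes inside it, so on any page that has results the inner loop never ends.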
Maybe you would consider using recursion instead.
I've written a small script that seems to work:
require 'open-uri'
require 'nokogiri'
class MyScrapper
def initialize;end
def call(keyword)
# Reject a missing or empty keyword before making any requests
return puts("Please enter a valid search") if keyword.nil? || keyword.empty?
scrape({}, keyword, 1)
end
private
def scrape(results, keyword, page)
doc = load_page(keyword, page)
return results if doc.css('div#noresults').any?
build_new_items(doc).merge(scrape(results, keyword, page+1))
end
def load_page(keyword, page)
url = "https://www.bible.com/search/bible?page=#{page}&q=#{keyword}&version_id=1"
# Needs require 'open-uri'; since Ruby 3.0, Kernel#open no longer opens URLs
Nokogiri::HTML(URI.open(url))
end
def build_new_items(doc)
items = doc.css("ul.search-result li.reference")
items.reduce({}) do |list, item|
title = item.css("h3").text.strip
content = item.css("p").text.strip
list[title] = content
list
end
end
end
You call it with MyScrapper.new.call("Keyword"). (It might make more sense to have this as a module you include, or to make these class methods, to avoid the need to instantiate the class.)
What this does is call a method named scrape with the starting results, the keyword, and the page number. It loads the page, and if there are no results it returns the results it has found so far.
Otherwise it builds a hash from the page it loaded, then calls itself for the next page and merges those results into the hash it just built. It repeats this until there are no more results.
If you want to limit the number of pages, you can just change this line:
return results if doc.css('div#noresults').any?
to this:
return results if doc.css('div#noresults').any? || page > 999
Note: You might want to double-check the results that are being returned are correct. I think they should be but I wrote this quite quickly, so there could always be a small bug hiding somewhere in there.
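The same stop-on-empty-page logic can also be written as a single loop. The sketch below is only an illustration under stated assumptions: fetch_page is a hypothetical callable standing in for loading a page with Nokogiri and turning it into a hash, and it returns nil when the page has no results. A stub replaces the network so the control flow can be run on its own.

```ruby
# Collect results page by page until a page reports no results
# or the page limit is reached.
def scrape_all(fetch_page, max_pages: 999)
  results = {}
  page = 1
  while page <= max_pages
    items = fetch_page.call(page)
    break if items.nil? # nil plays the role of the div#noresults check
    results.merge!(items)
    page += 1
  end
  results
end

# Stub data: two pages of results, then nothing.
fake_pages = {
  1 => { "Verse A" => "text a" },
  2 => { "Verse B" => "text b" }
}

# Hypothetical page loader backed by the stub data above.
fetch_page = ->(page) { fake_pages[page] }

all_results = scrape_all(fetch_page)
limited = scrape_all(fetch_page, max_pages: 1)

puts all_results.inspect
puts limited.inspect
```

Because the exit condition is checked once per page, right after the page is loaded, the loop cannot spin on a stale value.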

How to reuse captured block in Slim and avoid duplicates?

Look at this code snippet:
require 'slim'
SLIM = <<-SLIM
- column do
  = 'Text '
SLIM
def column(&block)
$column = block
end
#########
template = Slim::Template::new { SLIM }
template.render(self)
p $column.call
p $column.call
p $column.call
As you can see, I have captured a block (it renders the string 'Text ') in the $column global variable and called it 3 times. I expect this to print:
"Text "
"Text "
"Text "
but instead I see:
"Text "
"Text Text "
"Text Text Text "
How can I capture the block and avoid the duplicates?
I think this is because the block you pass contains = 'Text ', and = in Slim appends its value to the template's output, which is why the string grows with each call.
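A simplified model of that behaviour, in plain Ruby rather than Slim's actual generated code: assume the template output lives in one shared string (called buffer here, a hypothetical name) and that the captured block appends to it every time it runs. Under that assumption, one workaround is to remember the buffer length before the call and keep only what was appended.

```ruby
# Shared output buffer, standing in for the template's internal one.
buffer = +""

# The captured block appends to the shared buffer and returns it,
# which is why repeated calls appear to accumulate.
captured = -> { buffer << "Text " }

first  = captured.call.dup
second = captured.call.dup

# Return only what this particular call appended to the buffer.
def call_fresh(buffer, block)
  start = buffer.length
  block.call
  buffer[start..-1]
end

fresh_a = call_fresh(buffer, captured)
fresh_b = call_fresh(buffer, captured)

p first, second, fresh_a, fresh_b
```

Whether the same slicing is practical inside a real Slim helper depends on how the buffer is exposed, which this sketch does not cover.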
Why can't you just call template.render(self) multiple times?
require 'slim'
SLIM = <<-SLIM
- column do
  = 'Text '
SLIM
def column(&block)
block.call
end
#########
template = Slim::Template::new { SLIM }
p template.render(self)
p template.render(self)
p template.render(self)
Try p #{yield} directly 3 times if you are using Slim without a framework.

Refreshing a view in Gtk3-Ruby

I'm having a problem changing a view in my feed reader. When a button in the feed list is clicked, the feed window is supposed to update. Instead, the feed window stays empty. How do you remove and replace a widget in gtk3-ruby?
The problem method:
def feed=(feed)
@feed.destroy()
@title, @count = feed.channel.title, feed.items.size
@label.set_markup "<b>#{@title} (#{@count} articles)</b>"
@feed = FeedItems.new(feed.items, @parent)
self.pack_end(@feed)
@feed.show()
end
The full source is on pastebin:
http://pastebin.com/KPKAfCmx
I should have used show_all; with that change, the widget updates.
def feed=(feed)
self.remove(@feed)
@title, @count = feed.channel.title, feed.items.size
@label.set_markup "<b>#{@title} (#{@count} articles)</b>"
@feed = FeedItems.new(feed.items, @parent)
self.pack_end(@feed)
self.show_all
end

Using Ruby to wait for a page to load an element

I currently have a working piece of Ruby that looks like this:
def error_message
browser.span(:id => 'ctl00_cphMainContent_lblMessage').wait_until_present(30) do
not errors.empty?
end
errors
end
However, I'd prefer something more like this:
span(:errors, :id => 'ctl00_cphMainContent_lblMessage')
##farther down##
def error_message
browser.errors.wait_until_present(30) do
etc...
I'm new to Ruby, but how can I do something like this, assuming it's possible?
Typically, you make use of the Watir::Wait.until or <element>.wait_until_present methods.
In this way, you could do something like:
# Earlier in code
@browser = Watir::Browser.start('http://mypage.com/')
# ...
errors = { my_first_error: "You forgot your keys in the car!" }
check_for_errors(errors[:my_first_error])
# Wherever
def check_for_errors(error, expiry=30)
# Locate the message span by id and expected text, then wait for it to appear
error_element = @browser.span(:id => 'ctl00_cphMainContent_lblMessage', :text => error)
error_element.wait_until_present(expiry)
end
See the Watir documentation for more information.
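Conceptually, waits like these poll a condition until it becomes true or a timeout expires. The helper below is a minimal sketch of that idea in plain Ruby; wait_until, its timeout and interval parameters, and the WaitTimeout error are hypothetical names for illustration, not Watir's implementation or API.

```ruby
# Error raised when the condition never became true in time.
class WaitTimeout < StandardError; end

# Poll the block until it returns a truthy value or `timeout` seconds pass.
def wait_until(timeout: 30, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeout, "condition not met within #{timeout} seconds"
    end
    sleep interval
  end
end

# Example: the condition becomes true on the third check.
attempts = 0
value = wait_until(timeout: 2, interval: 0.01) do
  attempts += 1
  attempts >= 3 && "ready"
end

puts "#{value} after #{attempts} attempts"
```

With a real browser, the block would check something like whether the error span is present, and the rescue path would handle the case where it never appears.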
