How can I continuously open websites using Watir? - ruby

I have an array of url strings (i.e. "http://www.cnn.com") which I want to iterate through and open in Safari using watir.
urlArray.each do |url|
  browser.goto(url)
end
will open the first page, but it never proceeds to the next pages in the array.
Any ideas on what's going on?

This worked for me, it opened both Google and Yahoo.
require "rubygems"
require "safariwatir"
urlArray = ["http://google.com", "http://yahoo.com"]
browser = Watir::Safari.new
urlArray.each do |url|
  browser.goto url
end
When I added "http://www.cnn.com" to urlArray:
urlArray = ["http://www.cnn.com", "http://google.com", "http://yahoo.com"]
it opened only cnn.com, so the problem is with that page.
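If one page in the list hangs or raises (as cnn.com did here), a rescue inside the loop keeps the iteration going. This is a sketch with a hypothetical visit_all helper; any browser object that responds to goto will work:

```ruby
# Hypothetical helper: visit each URL, rescuing per-page errors so one
# broken or slow page does not stop the whole loop.
def visit_all(browser, urls)
  visited = []
  urls.each do |url|
    begin
      browser.goto(url)
      visited << url
    rescue StandardError => e
      warn "Skipping #{url}: #{e.class}: #{e.message}"
    end
  end
  visited # the URLs that loaded without raising
end
```

Used as visit_all(browser, urlArray) in place of the bare each loop.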

Related

Full-page screenshot with URL in ruby ​and watir

With the code below, I get a screenshot of only part of the screen, but I would like to take a full screenshot that shows the URL of the page as well.
Is it possible?
AfterStep do
  encoded_img = @browser.driver.screenshot_as(:base64)
  embed("data:image/png;base64,#{encoded_img}", 'image/png')
end
Watir doesn't have a provision to capture a screenshot along with the URL. But we can use win32/screenshot (together with win32ole) to do this:
require 'win32/screenshot'
Win32::Screenshot::Take.of(:desktop).write(image_path)
In my case for capturing the full-screenshot with a URL, I do the following
# Code to capture the full-page screenshot with a URL
require 'watir'
require 'win32ole'
require 'win32/screenshot'
# Launch a browser and navigate to the page
browser = Watir::Browser.new :ie
browser.goto "https://www.google.com"
win_title = browser.title # Fetch the title
# Use AutoIt to get the focus of the browser to front
WIN32OLE.new("AutoItX3.Control").ControlFocus(win_title, "", "")
# Capture the screen shot of the desktop
sleep 2 # Hold for 2s
image_path = "image_path#{rand(10000)}.png"
Win32::Screenshot::Take.of(:desktop).write(image_path)

Mechanize scraping google urls

I have a program that searches Google using a keyword or keywords taken as a parameter when running the program:
example: pull_sites.rb "testing"
returns these sites >>>
https://en.wikipedia.org/wiki/Software_testing
http://en.wikipedia.org/wiki/Test_automation
http://www.istqb.org/about-istqb.html
http://softwaretestingfundamentals.com/test-plan/
https://en.wikipedia.org/wiki/Software_testing
http://webcache.googleusercontent.com/search%3Fhl%3Den%26biw%26bih%26q%3Dcache:9qU2GDLzZzEJ:https://en.wikipedia.org/wiki/Software_testing%252Btesting%26gbv%3D1%26%26ct%3Dclnk
https://en.wikipedia.org/wiki/Test_strategy
https://en.wikipedia.org/wiki/Category:Software_testing
https://en.wikipedia.org/wiki/Test_automation
https://en.wikipedia.org/wiki/Portal:Software_testing
https://en.wikipedia.org/wiki/Test
http://webcache.googleusercontent.com/search%3Fhl%3Den%26biw%26bih%26q%3Dcache:R94CAo00wOYJ:https://en.wikipedia.org/wiki/Test%252Btesting%26gbv%3D1%26%26ct%3Dclnk
https://en.wikipedia.org/wiki/Unit_testing
http://webcache.googleusercontent.com/search%3Fhl%3Den%26biw%26bih%26q%3Dcache:G9V8uRLkPjIJ:https://en.wikipedia.org/wiki/Unit_testing%252Btesting%26gbv%3D1%26%26ct%3Dclnk
https://testing.byu.edu/
http://webcache.googleusercontent.com/search%3Fhl%3Den%26biw%26bih%26q%3Dcache:d9bGrCHr9fsJ:https://testing.byu.edu/%252Btesting%26gbv%3D1%26%26ct%3Dclnk
https://www.test.com/
http://webcache.googleusercontent.com/search%3Fhl%3Den%26biw%26bih%26q%3Dcache:S92tylTr1V8J:https://www.test.com/%252Btesting%26gbv%3D1%26%26ct%3Dclnk
http://ddce.utexas.edu/disability/using-testing-accommodations/
http://blogs.vmware.com/virtualblocks/2015/07/06/vsan-vs-nutanix-head-to-head-performance-testing-part-4-exchange/
http://www.networkforgood.com/nonprofitblog/testing-101-4-steps-optimizing-your-fundraising-approach/
http://www.auslea.com/software-testing-training.html
http://academy.littletonpublicschools.net/Default.aspx%3Ftabid%3D12807%26articleType%3DArticleView%26articleId%3D2400
https://golang.org/pkg/testing/
http://webcache.googleusercontent.com/search%3Fhl%3Den%26biw%26bih%26q%3Dcache:EALG7Jlm9eoJ:https://golang.org/pkg/testing/%252Btesting%26gbv%3D1%26%26ct%3Dclnk
http://www.speedtest.net/
http://webcache.googleusercontent.com/search%3Fhl%3Den%26biw%26bih%26q%3Dcache:M47_v0xF3m8J:http://www.speedtest.net/%252Btesting%26gbv%3D1%26%26ct%3Dclnk
https://www.act.org/content/act/en/products-and-services/the-act/taking-the-test.html
http://webcache.googleusercontent.com/search%3Fhl%3Den%26biw%26bih%26q%3Dcache:1sMSoJBXydoJ:https://www.act.org/content/act/en/products-and-services/the-act/taking-the-test.html%252Btesting%26gbv%3D1%26%26ct%3Dclnk
http://www.act.org/content/act/en/products-and-services/the-act/test-preparation.html
http://webcache.googleusercontent.com/search%3Fhl%3Den%26biw%26bih%26q%3Dcache:pAzlNJl3YY4J:http://www.act.org/content/act/en/products-and-services/the-act/test-preparation.html%252Btesting%26gbv%3D1%26%26ct%3Dclnk
It works as expected, but it only scrapes the first page of Google results. Is it possible to search, say, pages 1-5?
Here's the source of the scrape:
def get_urls
  puts "Searching...".green
  agent = Mechanize.new
  page = agent.get('http://www.google.com/')
  google_form = page.form('f')
  google_form.q = "#{SEARCH}" # SEARCH is the parameter given when the program is run
  page = agent.submit(google_form, google_form.buttons.first)
  page.links.each do |link|
    if link.href.to_s =~ /url.q/
      str = link.href.to_s
      strList = str.split(%r{=|&})
      url = strList[1]
      File.open("links.txt", "a+") { |s| s.puts(url) }
    end
  end
end
OK, if you are using Google Chrome or Firefox, open up the developer tools. This will help you identify the links you want to automate clicking. When you do a Google search and scroll to the bottom, you will see the page links to click on. Using the developer tools in your browser, identify what class or id Google assigns to these page-number links. Then use Mechanize's click method to follow them. For example, if the link is labelled "next", you can use something simple like:
page2 = page1.link_with(:text => "next").click
I'm answering from my phone so it may save you time to google "click a link" with mechanize for more details on it.
That's a GET form, so it's much easier just to make the request yourself:
https://www.google.com/search?q=foo
https://www.google.com/search?q=foo&start=10
https://www.google.com/search?q=foo&start=20
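A sketch of that approach, assuming Google's start parameter still advances results 10 per page (google_search_urls is a made-up helper name):

```ruby
require 'cgi'

# Build the search-result URLs for the first n pages; Google's "start"
# query parameter offsets the results by 10 per page.
def google_search_urls(query, pages: 5)
  (0...pages).map do |page_index|
    "https://www.google.com/search?q=#{CGI.escape(query)}&start=#{page_index * 10}"
  end
end

# Then fetch each one with Mechanize instead of submitting the form:
#   agent = Mechanize.new
#   google_search_urls(SEARCH).each { |u| page = agent.get(u) }
```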

Save image with watir-webdriver

How can I save an image that is loaded via watir-webdriver? All the manuals and examples only show fetching the src of the image and saving it using open-uri. But I need to save the exact image that was generated when my page loaded. How can I do this?
Could I use watir and watir-webdriver at the same time? For example:
require 'watir-webdriver'
require 'watir'
@driver = Watir::Browser.new :firefox
@driver.goto(@base_url)
@img = @driver.image(id: 'CaptchaImage').save("2131.png")
How can I do something like this? Or else, how can I get the image from the cache?
Could anybody help me with this?
OpenURI will help you:
require "watir-webdriver"
require "open-uri"
b = Watir::Browser.new :chrome
b.goto "http://stackoverflow.com/"
File.open("target_file.jpg", 'wb') do |f|
  f.write open(b.img(:class, "sponsor-tag-img").src).read
end
Hope you are not doing anything bad.. :)
Please let me know if it helped.
And if you're doing it frequently, you can extend Watir's Image class with:
require 'watir-webdriver'
Watir::Image.class_eval do
  def save(path_to_new_file)
    # require open-uri so that src can be opened without raising:
    # Errno::ENOENT: No such file or directory @ rb_sysopen
    require 'open-uri'
    open(path_to_new_file, 'wb') do |file|
      file << open(src).read
    end
  end
end
so it can be used as follows:
browser = Watir::Browser.start 'google.com'
image = browser.img
image.save 'our_images/hi.png'
According to the watir-webdriver documentation, the Image#save method has not been implemented, so it is not available there.
On the other hand, watir-classic does have a Image#save method. This is the example from the rdoc:
browser.image.save("c:/foo/bar.jpg")
This turned out to be a little harder than it should be, but I needed to accomplish it since an image was only accessible when a valid session cookie was set. This is how I finally managed it:
1. Install watir-extensions-element-screenshot
https://github.com/ansoni/watir-extensions-element-screenshot
You probably want to do gem install watir-extensions-element-screenshot.
2. Resize the browser window
This also works with headless PhantomJS. After you initialize the browser, set the window size to something rather big to prevent a bug from happening when the image is larger than the browser window.
browser = Watir::Browser.new :phantomjs
browser.window.resize_to(1900, 1080)
3. Get the image element and screenshot it
In my case, the entire site is an image. Luckily, browser.html does show that the image is still encapsulated in an <img> tag, so we can access the image (in this example all images on the page) like so:
browser.elements(:tag_name => "img").each do |x|
  x.screenshot("file_name.png")
end
This will save the image to file_name.png. It's not the exact same file, but rather a screenshot of it. As far as image downloading is concerned, though, this is a perfect solution for me and hopefully for others too!

Clicking group of links in watir

I am new to Ruby, and I am trying to work with Watir. I think I have the basics, but I am having trouble clicking all links whose id matches a regex. I tried this:
require "watir-webdriver"
browser = Watir::Browser.new :ff
browser.goto "http://mysite.com"
browser.links(:id, /asd[0-7]/).each do |adv|
  adv.click
  sleep 1
end
But it doesn't seem to be clicking the links. Am I doing something wrong here? The links open in new windows, so looping through them is no problem, but I couldn't make the loop work.
This kind of investigation is better in IRB. Anyway, you should validate that you have links to click.
require "watir-webdriver"
browser = Watir::Browser.new :ff
browser.goto "https://rvm.io/"
links = browser.links(:href => /gemsets/)
links.count
I changed mine up to use a site I can access and has links.
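One thing worth checking in plain Ruby first: a regex locator like /asd[0-7]/ is unanchored, so it matches any id containing that substring, not just ids that equal it. A quick sketch (the ids here are made up) of which ids such a pattern selects:

```ruby
# Filter a list of element ids the way an unanchored regex locator does:
# a match anywhere inside the id string counts.
def ids_matching(ids, pattern)
  ids.grep(pattern)
end
```

If the pattern matches nothing, the collection is empty and the each loop silently does nothing, which looks exactly like "it doesn't click".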

stumped on clicking a link with nokogiri and mechanize

Perhaps I'm doing it wrong, or there's a more efficient way. Here is my problem:
First, using Nokogiri, I open an HTML document and use its CSS selectors to traverse the document until I find the link I need to click.
Now, once I have the link, how do I use Mechanize to click it? According to the documentation, the click method accepts either a string or a Mechanize::Page::Link object.
I cannot use a string, since there could be hundreds of the same link; I only want Mechanize to click the link that Nokogiri found.
Any ideas?
After you have found the link node you need, you can create the Mechanize::Page::Link object manually, and click it afterwards:
agent = Mechanize.new
page = agent.get "http://google.com"
node = page.search ".//p[@class='posted']"
Mechanize::Page::Link.new(node, agent, page).click
An easier way than @binarycode's option:
agent = Mechanize.new
page = agent.get "http://google.com"
page.link_with(:class => 'posted').click
It is simple; you don't need to use Mechanize's link_with().click.
You can just get the link and update your page variable.
Mechanize saves the current working site internally, so it is smart enough to follow local links.
Example:
agent = Mechanize.new
page = agent.get "http://somesite.com"
next_page_link = page.search('your exotic selectors here').first rescue nil # Nokogiri node
next_page_href = next_page_link['href'] rescue nil # e.g. '/local/link/file.html'
page = agent.get(next_page_href) if next_page_href # goes to 'http://somesite.com/local/link/file.html'
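Under the hood, that relative-link resolution is just URI joining; a sketch of what Mechanize does with the current site's base URL (the URLs here are the example's placeholders, and absolutize is a made-up helper name):

```ruby
require 'uri'

# Resolve a relative href against the current page's URL, the way
# Mechanize does internally when you pass agent.get a local path.
def absolutize(base_url, href)
  URI.join(base_url, href).to_s
end
```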