How can I save a web page in Watir? (ruby)

Using Ruby and Watir, can I save a web page the same way as right-clicking and choosing "Save page as..." in the browser?
I need to save the current web page from a script.

Yes, you can do it with Watir. Just open the page and write browser.html to any destination you want:
require 'watir'

b = Watir::Browser.new :phantomjs # I am using phantomjs for scripted (headless) browsing
b.goto 'http://google.com'
File.open('/tmp/google', 'w') { |f| f.write b.html }

I don't know about Watir, but the way to do it with Selenium WebDriver is the page_source method.
Check out the docs for that here:
http://selenium.googlecode.com/git/docs/api/rb/Selenium/WebDriver/Driver.html#page_source-instance_method
Using this, you should get the whole source.
You can then save the source by just writing it to a new file. I haven't tried this, but you can check it out:
require 'selenium-webdriver'

driver = Selenium::WebDriver.for(:firefox)
driver.get(url_of_page_to_save) # url_of_page_to_save is a placeholder for your URL
file = File.new(filename, "w")
file.puts(driver.page_source)
file.close
Not sure if this saves all elements of the page (page_source returns only the HTML, not assets like images or CSS).
Hope this helped a bit!

Related

Do not wait for page to finish loading in selenium

As the title states, I'm trying to create a script which opens multiple tabs in a browser. At the moment the script seems to wait until each page has finished loading before moving on to a new tab. Is there a way to move on without waiting for the page to load? It seems to be hard to find relevant information online.
#!/usr/bin/env ruby
require 'selenium-webdriver'

file = File.open(ARGV[0], 'r')
driver = Selenium::WebDriver.for :firefox

file.each do |host|
  driver.get(host)
  driver.execute_script("window.open()")
  driver.switch_to.window(driver.window_handles.last)
end
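One way to stop blocking on slow pages (a sketch, not from this thread, assuming selenium-webdriver's page-load timeout; note the error class is spelled TimeOutError in older versions) is to set a short page-load timeout and rescue the resulting error before opening the next tab:

#!/usr/bin/env ruby
require 'selenium-webdriver'

driver = Selenium::WebDriver.for :firefox
driver.manage.timeouts.page_load = 2 # give up on a page load after 2 seconds

File.readlines(ARGV[0]).each do |host|
  begin
    driver.get(host.strip)
  rescue Selenium::WebDriver::Error::TimeoutError
    # the page did not finish loading in time; move on anyway
  end
  driver.execute_script("window.open()")
  driver.switch_to.window(driver.window_handles.last)
end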

Can I set Watir to not open a window?

Until recently, I was using Mechanize to get a web page and then doing some parsing with Nokogiri. But because some content was loaded with Ajax after the initial load, I have started using Watir instead. My code looks like this:
def get_page(url)
  browser = Watir::Browser.start url
  sleep 1
  page = Nokogiri::HTML.parse(browser.html)
  browser.close
  return page
end
It works fine, but since I am getting a lot of pages, browser.start will open a ton of windows. I found close, as you see, but is there a way to just not show the browser window at all?
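A sketch reusing the idea from the first answer at the top of this page: drive a headless browser such as PhantomJS, which never opens a window (assuming watir-webdriver, or watir 6, with phantomjs installed):

require 'watir'
require 'nokogiri'

def get_page(url)
  browser = Watir::Browser.new :phantomjs # headless: no window is shown
  browser.goto url
  sleep 1
  page = Nokogiri::HTML.parse(browser.html)
  browser.close
  page
end

If you fetch many pages, creating the browser once outside the method and reusing it avoids spawning a new browser process per page.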

Using Cucumber and Ruby to Download File

I've written some Ruby code (connected with Cucumber) that will go to a website and click a file that I'd like to download. The browser I'm using for this is Google Chrome.
Typically, when you go to download a file in Chrome, it doesn't ask for permission. However, when I run the code I made, it says:
"This type of file can harm your computer. Do you want to keep file_name.exe anyway?" It gives 2 options, "keep" or "discard". I have to click keep.
Obviously, you don't want all executables to just start downloading; however, this particular website/file should always be trustworthy.
Is there a command in Ruby or Cucumber that allows you to click the "keep" button automatically? This could just be a general "click at this pixel" or something. Or is there a way to mark a particular website in Chrome as safe? You can't inspect the element because it's not part of the website but part of the browser. Preferably without having to download other software.
With this being said, if this is possible, it should also be possible to automate an installation (as in clicking Next -> Next -> etc.) for you. Hopefully this is correct?
Thanks in advance.
You can implement this in any browser, but for Google Chrome here is the solution:
require 'selenium-webdriver'

profile = Selenium::WebDriver::Chrome::Profile.new
profile['download.prompt_for_download'] = false
profile['download.default_directory'] = "Absolute or relative path to your download directory"

browser = Selenium::WebDriver.for :chrome, :profile => profile
You haven't specified which gem you use to drive the browser. But even if you use watir-webdriver, you can use the same profile created above:
browser = Watir::Browser.new :chrome, :profile => profile
I actually switched to using Sikuli, which worked pretty well. Thanks for the help, though.
Do you really need or want the browser to download the file? Are you really testing the browser's download feature, or do you want to verify that the server can serve the file and that it is what you expect?
I found the idea of setting up a default directory and having to check for the file clumsy, fragile and error-prone, especially when setting up on a new host, and especially for tests that run in multiple browsers.
My solution is to just use Ruby (or whatever language) features to download the file directly, and then validate that it is the file it's supposed to be. I'm not testing the browser, I'm testing the software. The only exception to that idea I can think of is if you use some javascript logic or something browser-dependent to redirect you to a link, but please don't ever do that.
However, you run into a problem if you have to log in to access your file; you either have to implement auth in your Ruby code, which isn't technically part of your Cucumber specification, or you need the cookies. I use this code to copy the cookies to avoid logging in again, and grab the file:
require 'open-uri'

def assert_file_link(uri, filename, content_type)
  f = open_uri_with_cookies uri
  attachment_filename = f.meta["content-disposition"].sub("Attachment;filename=", "") # e.g. "Attachment;filename=Simple Flow - Simple Form.rtf"
  content_length = Integer(f.meta["content-length"])
  assert(f.status == ["200", "OK"], "Response was not 200 OK")
  assert(f.content_type == content_type, "Expected content-type of '#{content_type}' but was '#{f.content_type}'")
  assert(attachment_filename == filename, "Expected filename of '#{filename}' but was '#{attachment_filename}'")
  assert(content_length > 0, "Expected content-length > 0 but was '#{content_length}'")
end

def open_uri_with_cookies(uri)
  # hack the cookies from the existing session so we don't need to log in!
  # driver is the session's WebDriver instance
  cookies = ""
  driver.manage.all_cookies.each { |cookie| cookies.concat("#{cookie[:name]}=#{cookie[:value]}; ") }
  if block_given?
    open(uri, "Cookie" => cookies, :proxy => nil) do |f|
      yield f
    end
  else
    open(uri, "Cookie" => cookies, :proxy => nil)
  end
end
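For example (a sketch, assuming the block form above; "downloads/report.rtf" is just a placeholder destination), you can stream the attachment straight to disk:

open_uri_with_cookies(uri) do |f|
  # write the response body to a local file ("downloads/report.rtf" is a placeholder)
  File.open("downloads/report.rtf", "wb") { |out| out.write(f.read) }
end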
Hope this helps.

How to click on browser "stop loading this page" using Ruby Watir?

I want to click the browser's "stop loading this page" icon when a timeout error occurs. What should I do? Something like this:
browser.stop # this is wrong.
I searched forever and the best I could come up with was:
b = Watir::Browser.new
b.driver.manage.timeouts.implicit_wait = 3
This worked great for me with firefox.
The links below may help you:
Stop loading page watir-webdriver
Is there have method to stop page loading in watir-webdriver
Just try this when you want to stop the loading:
browser.send_keys :escape
It is not the best way of doing what you want, but the above approach worked for me.
If you use watir-webdriver, then you can try this for IE:
browser.execute_script "document.execCommand('Stop')"

Unexplained Inconsistency when Downloading an XLS file with Ruby Mechanize after redirect

I have a script that visits fcc.gov, then clicks a link which triggers a download:
require "mechanize"
docket_number = "12-268" #"96-128"
url = "http://apps.fcc.gov/ecfs/comment_search/execute?proceeding=#{docket_number}"
agent = Mechanize.new
agent.pluggable_parser.default = Mechanize::DirectorySaver.save_to 'downloads'
agent.get(url) do |page|
link = page.link_with(:text => "Export to Excel file")
xls = agent.click(link)
end
This works fine when docket_number is "12-268". But when you change it to "96-128", Mechanize downloads the HTML of the page instead of the desired spreadsheet.
The URLs for both pages are:
http://apps.fcc.gov/ecfs/comment_search/execute?proceeding=12-268 (works)
http://apps.fcc.gov/ecfs/comment_search/execute?proceeding=96-128 (this is where I need help)
As you can see, if you visit each page in a browser (I'm using Chrome) and click "Export to Excel file", a spreadsheet file is downloaded and there is no problem. "96-128" has many more rows, so when you click the Export link, it takes you to a new page that refreshes every 10 seconds or so until the file begins downloading. How can I get around this, and why is there this inconsistency?
Clicking Export on 96-128 takes you to a page that refreshes using this kind of tag (I'd never heard of it before):
<meta http-equiv="refresh" content="5;url=/ecfs/comment_search/export?exportType=xls"/>
By default, Mechanize will not follow these refreshes. To get around that, change a setting on agent:
agent.follow_meta_refresh = true
Source: https://stackoverflow.com/a/2166480/94154
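Applied to the original script, it's a one-line addition (a sketch; the rest is unchanged from the question):

require "mechanize"

docket_number = "96-128"
url = "http://apps.fcc.gov/ecfs/comment_search/execute?proceeding=#{docket_number}"

agent = Mechanize.new
agent.follow_meta_refresh = true # follow <meta http-equiv="refresh"> until the real download is served
agent.pluggable_parser.default = Mechanize::DirectorySaver.save_to 'downloads'

agent.get(url) do |page|
  link = page.link_with(:text => "Export to Excel file")
  xls = agent.click(link)
end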
The proceeding 12-268 has 48 entries, 96-128 has 4046.
When I click 'Export to Excel File' on the latter, there is sometimes a page saying:
Finished processing 933 of 4046 records.
Click if this page does not reload automatically.
I guess mechanize is seeing this, too.
