Blacklist URLs with headless Chrome - ruby

I'm trying to block URLs in my specs, achieving something like I had when using capybara_webkit:
Capybara::Webkit.configure do |config|
  config.block_url("*google*")
  config.allow_url('*my_website.com')
end
After reading this article, I tried to do something like:
require 'webmock/rspec'
module WebmockConfig
  def self.default_disabled_urls
    [
      '*google*'
    ]
  end
end
WebMock.disable_net_connect!(allow_localhost: true)
WebMock.disable_net_connect!(allow: WebmockConfig.default_disabled_urls)
but I'm getting
Real HTTP connections are disabled. Unregistered request: POST http://127.0.0.1/session
even if that should be solved by WebMock.disable_net_connect!(allow_localhost: true).
When running the specs without WebMock.disable_net_connect!(allow: WebmockConfig.default_disabled_urls), everything is working fine.

The capybara-webkit white/blacklisting affects the requests made by the browser, whereas WebMock can only affect requests made by your app. This means WebMock is useless for what you want, since it wouldn't actually stop your browser from loading anything from Google, etc. To do that while using the selenium driver, you need a programmable proxy like puffing-billy, which allows you to customize the responses for any matching requests the browser makes.
To configure a driver using headless chrome and puffing_billy you could do something like
Capybara.register_driver :headless_chrome do |app|
  browser_options = ::Selenium::WebDriver::Chrome::Options.new
  browser_options.headless!
  browser_options.add_argument("--proxy-server=#{Billy.proxy.host}:#{Billy.proxy.port}")
  Capybara::Selenium::Driver.new(app, browser: :chrome, options: browser_options)
end
Whether or not you need any other options is dependent on your system config, etc but you should be able to tell by looking at your current driver registration.
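For the original blacklist/whitelist behaviour, a minimal puffing_billy sketch could look like the following. The hostnames are the asker's, and the `whitelist` option and `proxy.stub` helper are taken from the puffing-billy README, so double-check them against your gem version:

```ruby
# spec/rails_helper.rb (sketch)
require 'billy/capybara/rspec'

Billy.configure do |c|
  # Let your own site and the local app server through untouched.
  c.whitelist = ['my_website.com', 'localhost', '127.0.0.1']
end

# Inside a spec, stub a URL the browser would otherwise hit for real:
proxy.stub('http://www.google.com/').and_return(text: '')
```

Anything not whitelisted or stubbed can then be configured to be blocked or cached by Billy.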

The allow_localhost: true setting is overwritten by the second call with allow: WebmockConfig.default_disabled_urls. You have to call WebMock.disable_net_connect! once with both settings, or add 'localhost' and '127.0.0.1' entries to self.default_disabled_urls.
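Concretely, the single combined call could look like this (a second disable_net_connect! call replaces, rather than merges with, the first):

```ruby
WebMock.disable_net_connect!(
  allow_localhost: true,
  allow: WebmockConfig.default_disabled_urls
)
```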

Related

Capybara with headless chrome doesn't clear session between test cases which use different subdomains

I switched my Rails tests from capybara-webkit to headless Chrome. When I run a test that visits a host other than the default Capybara host, the first case passes but the second one fails because the user is already logged in when they try to log in.
I use chromedriver v2.45, selenium-webdriver (3.141.0) and capybara (2.18.0)
I have the following setup:
require 'selenium-webdriver'
Capybara.register_driver :chrome do |app|
  options = Selenium::WebDriver::Chrome::Options.new(
    args: %w[headless disable-gpu no-sandbox]
  )
  Capybara::Selenium::Driver.new(app, browser: :chrome, options: options)
end
Capybara.javascript_driver = :chrome
I tried to change the app host to the default domain after visiting another domain
using_app_host("http://another.lvh.me") do
  visit '/'
  # do something
end
where using_app_host is
def using_app_host(host)
  original_host = Capybara.app_host
  Capybara.app_host = host
  yield
ensure
  Capybara.app_host = original_host
end
but it didn't help.
The spec structure looks the following way:
feature "Use another subdomain", js: true do
  before { login } # use default Capybara app host http://root.lvh.me

  scenario "case 1" do
    using_app_host("http://another.lvh.me") do
      # do something
    end
  end

  scenario "case 2" do
    using_app_host("http://another.lvh.me") do
      # do something else
    end
  end
end
Any ideas why capybara/headless chrome doesn't clean the user session between the test cases when navigating to another domain?
Are you storing session information in the browser's window.localStorage and/or window.sessionStorage? If so, you can set those to be cleared via options passed to the driver. (Note: these settings are the default for the selenium driver in Capybara 3.12+.)
Capybara.register_driver :chrome do |app|
  options = Selenium::WebDriver::Chrome::Options.new(args: %w[no-sandbox])
  options.headless!
  Capybara::Selenium::Driver.new(app, browser: :chrome, options: options, clear_local_storage: true, clear_session_storage: true)
end
I was facing the same issue. Adding steps to clear the cookies and session didn't work either, so I added the code below in env.rb to start a new session for every test.
Maybe you can try this.
Before do
  Capybara.session_name = ":session_#{Time.zone.now.to_i}"
end

After do
  Capybara.current_session.driver.quit
end
Also, you can add a Chrome option to open the session in an incognito window.
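For example, with the options object from the registration above (--incognito is a standard Chrome command-line switch):

```ruby
Capybara.register_driver :chrome do |app|
  options = Selenium::WebDriver::Chrome::Options.new(args: %w[no-sandbox])
  options.headless!
  options.add_argument('--incognito') # each launch starts with a clean profile
  Capybara::Selenium::Driver.new(app, browser: :chrome, options: options)
end
```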
I found this thread useful in a reverse context. I have a test setup wherein I'm storing session credentials in local storage. And so upgrading from capybara v3.11 to v3.12 broke the suite such that only the first scenario would pass and the rest of the scenarios would fail on the login page every time.
That's because the local storage was getting cleared based on the default behavior of capybara 3.12
I updated my suite to set clear_local_storage and clear_session_storage to false explicitly at time of registering the driver.
Capybara.register_driver :selenium_chrome do |app|
  Capybara::Selenium::Driver.new(app,
                                 browser: :chrome,
                                 clear_local_storage: false,
                                 clear_session_storage: false)
end

How can I fake a response to Capybara/poltergeist using webmock?

I'm testing a webscraper and I'd like to use Webmock to deliver fake websites for faster testing. When I mock a website, Ruby's native HTTP library works fine, but Capybara doesn't seem capable of receiving the mocked response. I know that webmock is stubbing low level HTTP requests, and I assume it matters which one capybara uses and which one webmock is configured to use. However, I need to know how Capybara makes HTTP requests and how I can configure webmock to stub that particular method set.
require 'capybara/poltergeist'
require 'webmock'
require 'pry'
include WebMock::API
WebMock.disable_net_connect!(allow_localhost: true)
Capybara.register_driver :poltergeist do |app|
  Capybara::Poltergeist::Driver.new(app, js_errors: false)
end
# Configure Capybara to use Poltergeist as the driver
Capybara.default_driver = :poltergeist
Capybara.javascript_driver = :poltergeist
U = /google.com/
b = Capybara.current_session
stub_request(:any, U).
  with(headers: { 'Accept' => '*/*', 'Accept-Encoding' => 'gzip;q=1.0,deflate;q=0.6,identity;q=0.3', 'User-Agent' => 'Ruby' }).
  to_return(status: 200, body: "abc", headers: {})
puts Net::HTTP.get(U,'/') #=> This returns "abc"
b.visit U
puts b.html #=> Throws error
The error I'm getting is as follows:
command': Request failed to reach server, check DNS and/or server status (Capybara::Poltergeist::StatusFailError)
I've tried using FakeWeb as well, but that simply was not capable of registering URIs. I'm open to using other APIs besides webmock if you think this is the wrong tool for the job.
Thanks in advance :)
Tom Walpole is correct. You can use WebMock to mock things your server is connecting to, but the browser makes its own connections and is unaffected by the changes you make to the server.
If you want to fake responses that the browser requests from other servers, try something like Puffing Billy. Take a look at its caching capability, which can be set up to replay results (much like VCR).
If you're working with something VERY simple you could try just loading the data you need with Capybara.string. But that's probably too limited for what you want.
Capybara doesn't make web requests, it tells the browser where to visit and the browser in turn makes the request. The way to do what you want is to use a proxy that can redirect specific browser requests to your own app
There is a newer and better way of doing this.
# spec/spec_helper.rb
RSpec.configure do |config|
  config.before(:each) do |example|
    if example.metadata[:type] == :feature
      Capybara::Webmock.start
    end
  end

  config.after(:suite) do
    Capybara::Webmock.stop
  end
end
Then use the capybara_webmock JavaScript driver:
# Use Chrome Driver
Capybara.javascript_driver = :capybara_webmock_chrome
https://github.com/hashrocket/capybara-webmock
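A sketch of the gem setup (check the capybara-webmock README for the current install steps and version constraints):

```ruby
# Gemfile
group :test do
  gem 'capybara-webmock'
end

# spec/spec_helper.rb
require 'capybara/webmock'
```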

Exclude not working for Rack::SSL

I currently have a Sinatra project that I am trying to add SSL to, so I added Rack::SSL, which worked fine, but I'd like to have it disabled in development mode.
class Blog < Sinatra::Base
  use Rack::SSL, :exclude => lambda { |env| ENV['RACK_ENV'] != 'production' }
  ...
This is the code I have, and ENV['RACK_ENV'] returns 'development' when I pry, but for some reason when I hit my site locally it still tries to redirect to https.
I got caught out by this last week. It turns out I'd enabled HTTP Strict Transport Security (HSTS) too, which meant that once a cookie for the site had been served over HTTPS, the browser would prevent any future requests to the non-HTTPS version of the site.
Thought I'd mention it just in case you've got the same.
This works; Chrome just had the redirect cached from before I added this, so I thought it wasn't working. It worked fine in an incognito window.
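If you'd rather not rely on the :exclude lambda at all, another sketch (using the same Blog class as above) is to register the middleware only in production, so development builds never load it:

```ruby
require 'sinatra/base'
require 'rack/ssl'

class Blog < Sinatra::Base
  configure :production do
    use Rack::SSL # middleware is never installed outside production
  end
end
```

This also sidesteps the cached-redirect confusion, since no redirect is ever issued in development.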

Enabling HTML5 cache manifest in Poltergeist/PhantomJS tests

My app uses an HTML5 cache manifest file and caches several js/css/html files on the client side. We are having problems testing with Poltergeist - the same tests pass when run with Selenium. With Poltergeist the first test passes (the files are not yet cached), but the second and all the rest fail; the page is blank, as if the cache is not working. I tried to enable the PhantomJS disk cache by passing options to Poltergeist; in test_helper.rb (Rails' Test::Unit) I declared the poltergeist driver as:
Capybara.register_driver :poltergeist do |app|
  Capybara::Poltergeist::Driver.new(app, phantomjs_options: ['--disk-cache=true'])
end
Capybara.javascript_driver = :poltergeist
But this doesn't help. Any ideas?
[edit]: don't know if this is relevant but when I pause the test in the failing run and manually visit the page with cached content with
visit '/mobile'
=> {"status"=>"fail"}
status is failing, but when I visit a non-cached page, it works.
OK, so the PhantomJS developers are working on enabling localStorage support, but it hasn't been merged yet.

watir-webdriver change proxy while keeping browser open

I am using the Watir-Webdriver library in Ruby to check some pages. I know I can connect through a proxy using
profile = Selenium::WebDriver::Firefox::Profile.new # create a new profile
profile.proxy = Selenium::WebDriver::Proxy.new(     # add proxy data to the profile
  :http => proxyadress,
  :ftp => nil,
  :ssl => nil,
  :no_proxy => nil
)
browser = Watir::Browser.new :firefox, :profile => profile # create a browser window with this profile
browser.goto "http://www.example.com"
browser.close
However, when wanting to connect to the same page multiple times using different proxies, I have to create a new browser for every proxy. Loading (and unloading) the browser takes quite some time.
So, my question: Is there any way to change, using webdriver in ruby, the proxy adress Firefox uses to connect through while keeping the browser open?
If you want to test whether a page is blocked when accessed through a proxy server, you can do that through a headless library. I recently had success using mechanize. You can probably use net/http as well.
I am still not sure why you need to change the proxy server for a current session.
require 'mechanize'

session = Mechanize.new
session.set_proxy(host, port, user, pass)
session.user_agent_alias = 'Mac Safari' # pick one of Mechanize's canned user-agent strings
session.agent.robots = true # observe robots.txt rules
response = session.get(url)
puts response.code
You need to supply the proxy host/port/user/pass (user/pass are optional), and the url. If you get an exception, then the response.code is probably not friendly.
You may need to use an OS-level automation tool to automate going through the Firefox menus to change the setting as a user would.
For Windows users there is the option of either the new RAutomation tool or AutoIt. Both can be used to automate things at the OS UI level, which would let you go into the browser settings and change the proxy there.
Still, I'd think that if you are checking a large number of sites, the overhead of changing the proxy settings would be small compared to all of the site navigation, waiting for pages to load, etc.
Unless you are currently taking a 'row traverse' approach and changing proxy settings multiple times for each site you check? If that's the case, I would move to more of a by-column method (if we presume each column is a proxy and each row is a site): fire up the browser for one proxy, check all the sites, then change the proxy and re-check all the sites. That way you'd only change the proxy settings once per proxy, which should not add much overhead to your script.
It might mean a little more work storing and then reporting results at the end (if you had been writing them out a line at a time), but that's what hashes and arrays are for.
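The by-column traversal described above can be sketched in plain Ruby. Here make_browser is a hypothetical factory (e.g. one that builds a Watir::Browser behind the given proxy); the point is just the iteration order:

```ruby
# One browser per proxy; visit every site before switching proxies,
# so proxy setup happens once per proxy rather than once per site.
def check_sites_by_proxy(proxies, sites, make_browser)
  results = {}
  proxies.each do |proxy|
    browser = make_browser.call(proxy) # hypothetical: builds a browser behind `proxy`
    begin
      sites.each do |site|
        browser.goto(site)
        results[[proxy, site]] = browser.title # or whatever check you need
      end
    ensure
      browser.close # always release the browser before the next proxy
    end
  end
  results
end
```

With this shape, the proxy-specific setup cost is paid once per proxy rather than once per (proxy, site) pair, and the results hash can be reported at the end.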
