Watir-webdriver doesn't store all cookies - Firefox

When I go to the following link in Firefox (v12), the browser on my Ubuntu machine lets me log in normally.
https://r.espn.go.com/members/v3_1/login?language=en&forwardUrl=&appRedirect=http%3A%2F%2Fgames.espn.go.com
However, if I use watir-webdriver, I get the message: "Cookies must be enabled in order to login."
Here is the code to reproduce this issue with Watir:
require 'watir-webdriver'
browser = Watir::Browser.new
browser.goto "https://r.espn.go.com/members/v3_1/login?language=en&forwardUrl=&appRedirect=http%3A%2F%2Fgames.espn.go.com"
You will notice that the browser displays the "Cookies must be enabled" error message below the "email address or member name" field. When I inspected the stored cookies (by searching for "go.com"), I noticed that not all of the cookies stored in a normal browsing session are available in the Watir-driven session.
Any idea what would cause the discrepancy in cookies stored between the two modes, using the same browser?
Thanks!

There is no problem or discrepancy with watir-webdriver. What is happening here is a result of how the website is coded.
The page you are accessing (https://r.espn.go.com/members/v3_1/login?language=en&forwardUrl=&appRedirect=http%3A%2F%2Fgames.espn.go.com) is intended to be an overlay on http://espn.go.com. Whoever coded the site assumed that the overlay page would always be accessed after a hit to the main page. So the main page (http://espn.go.com) sets a cookie in order to test whether your user agent has cookies enabled. The overlay page with the sign-in form then checks whether the test cookie is present and, if not, displays the warning you are seeing.
What is important to understand is that watir-webdriver defaults to a clean profile for each new browser instance. This means that the browser does not have any of your cookies, extensions, preferences or browsing history. Because the clean profile has never visited http://espn.go.com to receive the test cookie, the warning is being displayed.
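You can confirm this from Watir itself by dumping the cookies the fresh profile has received; a minimal sketch:
require 'watir-webdriver'
browser = Watir::Browser.new
browser.goto "http://espn.go.com"
# Each entry is a hash with keys such as :name, :value and :domain
browser.cookies.to_a.each do |cookie|
  puts "#{cookie[:name]} (#{cookie[:domain]})"
end
browser.close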
There are two ways to avoid this warning:
You can visit the main page prior to the sign-in page, like so:
require 'watir-webdriver'
browser = Watir::Browser.new
browser.goto "espn.go.com" # hit the main page first so the test cookie is set
browser.goto "https://r.espn.go.com/members/v3_1/login?language=en&forwardUrl=&appRedirect=http%3A%2F%2Fgames.espn.go.com"
Or, you can use your default Firefox profile, which (presumably) already has the test cookie:
require 'watir-webdriver'
browser = Watir::Browser.new :firefox, :profile => "default" # reuse your existing Firefox profile
browser.goto "https://r.espn.go.com/members/v3_1/login?language=en&forwardUrl=&appRedirect=http%3A%2F%2Fgames.espn.go.com"
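Equivalently, you can load the named profile explicitly through selenium-webdriver and hand it to Watir (assuming a profile named "default" exists on the machine):
require 'watir-webdriver'
# Look up the named Firefox profile, then pass the profile object to Watir
profile = Selenium::WebDriver::Firefox::Profile.from_name("default")
browser = Watir::Browser.new :firefox, :profile => profile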
Hope that helps!

Related

How to use Ruby to scrape a React website from a remote Linux server?

I want to scrape a React website using the Ruby watir gem on a remote Linux server, but I keep getting the following error:
/var/lib/gems/2.3.0/gems/selenium-webdriver-3.142.3/lib/selenium/webdriver/firefox/binary.rb:134:in `path': can't modify frozen String (RuntimeError)
        from /var/lib/gems/2.3.0/gems/selenium-webdriver-3.142.3/lib/selenium/webdriver/common/service.rb:45:in `firefox'
Here is my code:
require 'watir'
browser = Watir::Browser.new :firefox, headless: true
browser.goto("https://www.pinterest.com")
There is a similar question here, but its links either return 404 or are archived, and the code is deprecated.
I need to login, then get a new page and push buttons on that page to download a report file for a date range.
You'll get that error if Firefox isn't installed or isn't accessible on your PATH. Reinstall it if you already have it.
Source: selenium/webdriver/firefox/binary.rb:134:in `path': can't modify frozen String (FrozenError)
So a reinstall of Firefox might help. If you have a different browser installed, you could test with Chrome, for instance.
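If Firefox is installed but simply not on the PATH, you can also point selenium-webdriver at the binary explicitly. A minimal sketch; the path below is an assumption, adjust it for your server (find yours with `which firefox`):
require 'watir'
# Tell selenium-webdriver where the Firefox binary lives (assumed location)
Selenium::WebDriver::Firefox.path = '/usr/bin/firefox'
browser = Watir::Browser.new :firefox, headless: true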
Some more things to look at:
You might also need to install the right webdriver. You can also use https://github.com/titusfortner/webdrivers
I got the same error you posted. Then I ran gem install webdrivers, required it in the code, and also switched to Chrome:
require 'watir'
require 'webdrivers' # auto-downloads and updates the matching driver binary
browser = Watir::Browser.new :chrome, headless: true
browser.goto("https://www.pinterest.com")
Finally, without the webdrivers gem you get something like:
C:/tools/ruby26/lib/ruby/gems/2.6.0/gems/selenium-webdriver-3.142.3/lib/selenium/webdriver/common/service.rb:136:in `binary_path': Unable to find chromedriver. Please download the server from https://chromedriver.storage.googleapis.com/index.html and place it somewhere on your PATH. More info at https://github.com/SeleniumHQ/selenium/wiki/ChromeDriver. (Selenium::WebDriver::Error::WebDriverError)
With everything set up correctly, you might still get console output like this (most likely coming from Chrome itself):
DevTools listening on ws://127.0.0.1:57725/devtools/browser/34a42518-c3d9-4e14-af8e-9a137b11625b
[0808/012434.304:INFO:CONSOLE(0)] "The Content-Security-Policy directive 'prefetch-src' is implemented behind a flag which is currently disabled.
", source: https://www.pinterest.com/ (0)
[0808/012437.286:INFO:CONSOLE(240)] "No signed in Google accounts available - visit accounts.google.com to ensure that at least one account is signed in, otherwise no data will be returned from this API.", source: https://www.gstatic.com//mss/boq-identity//js/k=boq-identity.IdentityYoloWebModuleset.en_US.fUFh6X86RzU.es5.O/am=Aw/d=1/rs=AOaEmlH5BdY58S_qoulxSYv6tYMpThlVYw/m=yolo_frame_library (240)

Enabling HTML5 cache manifest in Poltergeist/PhantomJS tests

My app uses an HTML5 cache manifest file and caches several JS/CSS/HTML files on the client side. We are having problems testing with Poltergeist - the same tests pass when run with Selenium. With Poltergeist the first test passes (the files are not yet cached), but the second and all subsequent tests fail; the page is blank, as if the cache were not working. I tried to enable the PhantomJS disk cache by passing options to Poltergeist; in test_helper.rb (Rails' Test::Unit) I declared the Poltergeist driver as:
Capybara.register_driver :poltergeist do |app|
  Capybara::Poltergeist::Driver.new(app, phantomjs_options: ['--disk-cache=true'])
end
Capybara.javascript_driver = :poltergeist
But this doesn't help. Any ideas?
[edit]: I don't know if this is relevant, but when I pause the failing test run and manually visit the page with cached content with
visit '/mobile'
=> {"status"=>"fail"}
the status is "fail", but when I visit a non-cached page, it works.
OK, so the PhantomJS developers are working on enabling localStorage support, but it hasn't been merged yet.
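In the meantime, PhantomJS does expose offline-storage flags you could experiment with alongside the disk cache. A hedged sketch; the storage path is a placeholder, and I haven't confirmed these flags fix the application-cache behavior:
Capybara.register_driver :poltergeist do |app|
  Capybara::Poltergeist::Driver.new(app, phantomjs_options: [
    '--disk-cache=true',
    '--offline-storage-path=/tmp/phantomjs-storage', # placeholder path
    '--offline-storage-quota=50000'                  # quota size; see the PhantomJS docs
  ])
end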

Screenshot of the URL section of the browser

I want to capture a screenshot of the browser's URL (address bar) section.
browser.screenshot.save('tdbank.png')
That saves the page rendered inside the browser window, but I want to capture the URL bar at the top of the browser itself. Any suggestions?
Sometimes the URL says http, sometimes https. I want to capture this in a screenshot and archive it. I know I could get it through
url = browser.url
and then do some comparison, but I need this for legal purposes and it should be done by taking a screenshot.
Thanks in advance.
If you're on windows, you could use the win32screenshot gem. For example:
require 'watir-webdriver'
require 'win32/screenshot'
b = Watir::Browser.new # using firefox as default browser
b.goto('http://www.example.org')
Win32::Screenshot::Take.of(:window, :title => /Firefox/).write("image.bmp")
b.close
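If the goal is to have the scheme on record next to the image, you could also archive the URL string alongside the screenshot; a minimal sketch (the file name is just an example):
require 'watir-webdriver'
b = Watir::Browser.new
b.goto('http://www.example.org')
File.write('image_url.txt', b.url)                     # archive the exact URL with the screenshot
warn 'not HTTPS!' unless b.url.start_with?('https://') # flag non-secure pages
b.close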

Why do we need user_agent_alias with a Mechanize object?

I was just reading up on Mechanize and found the code below on the Internet:
require 'mechanize'
require 'logger'
agent = Mechanize.new
agent.user_agent_alias = 'Windows IE 9' # send IE 9's User-Agent header with every request
agent.follow_meta_refresh = true        # follow <meta http-equiv="refresh"> redirects
agent.log = Logger.new(STDOUT)
Could anyone please explain why user_agent_alias and follow_meta_refresh are needed when Mechanize itself is a browser?
Mechanize isn't a browser. It is a page parser that gives you enough methods to make it easy/convenient to navigate through a site. But, in no way is it a browser.
user_agent_alias sets the signature Mechanize sends when it's running and making page requests. In your example it's trying to spoof the site by masquerading as "IE 9", but that signature alone won't fool a system that checks more than the User-Agent header.
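You can see exactly what the alias does by printing the resulting header; a quick sketch (Mechanize::AGENT_ALIASES holds the aliases it accepts):
require 'mechanize'
agent = Mechanize.new
agent.user_agent_alias = 'Windows IE 9'
puts agent.user_agent              # the full User-Agent string now sent with every request
puts Mechanize::AGENT_ALIASES.keys # every alias user_agent_alias= accepts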
follow_meta_refresh tells Mechanize to honor tags like <meta http-equiv="refresh" content="5; url=...">, which ask the client to load another page after a delay. Browsers follow these automatically; Mechanize only does so when this flag is set, otherwise it stops at the intermediate page. The documentation covers the details.

watir-webdriver change proxy while keeping browser open

I am using the Watir-Webdriver library in Ruby to check some pages. I know I can connect through a proxy using
profile = Selenium::WebDriver::Firefox::Profile.new # create a new profile
profile.proxy = Selenium::WebDriver::Proxy.new(     # set the proxy data in the profile
  :http => proxyadress,
  :ftp => nil,
  :ssl => nil,
  :no_proxy => nil
)
browser = Watir::Browser.new :firefox, :profile => profile # create a browser window with this profile
browser.goto "http://www.example.com"
browser.close
However, when I want to connect to the same page multiple times using different proxies, I have to create a new browser for every proxy. Loading (and unloading) the browser takes quite some time.
So, my question: is there any way, using webdriver in Ruby, to change the proxy address Firefox connects through while keeping the browser open?
If you want to test whether a page is blocked when accessed through a proxy server, you can do that through a headless library. I recently had success using mechanize. You can probably use net/http as well.
I am still not sure why you need to change the proxy server for a current session.
require 'mechanize'
session = Mechanize.new
session.set_proxy(host, port, user, pass)
session.user_agent = 'Mac Safari'
session.agent.robots = true # observe robots.txt rules
response = session.get(url)
puts response.code
You need to supply the proxy host/port/user/pass (user/pass are optional) and the URL. If the request raises an exception, the response code was probably not a friendly one.
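For the exception case, here is a minimal sketch of catching a blocked request with Mechanize (it raises Mechanize::ResponseCodeError for non-successful statuses):
begin
  response = session.get(url)
  puts response.code
rescue Mechanize::ResponseCodeError => e
  # e.g. a 403 from a proxy that blocks the page
  puts "request failed with status #{e.response_code}"
end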
You may need to use an OS-level automation tool to automate going through the Firefox menus to change the setting as a user would.
For Windows users there is the option of either the newer RAutomation tool or AutoIt. Both can be used to automate things at the OS UI level, which would let you go into the browser settings and change the proxy there.
Still, if you are checking a larger number of sites, I'd think the overhead of changing the proxy settings would not be that much compared to all of the site navigation and waiting for pages to load.
Unless you are currently taking a "row traverse" approach and changing proxy settings multiple times for each site you are checking? If that's the case, I would go towards more of a by-column method (if we presume each column is a proxy and each row is a site): fire up the browser for one proxy, check all the sites, then change the proxy and re-check all the sites. That way you'd only be changing the proxy settings once for each proxy, which should not add much overhead to your script.
It might mean a little more work storing and then reporting results at the end (if you had been writing them out a line at a time), but that's what hashes and arrays are for. A sketch of that loop follows.
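A minimal sketch of the by-column loop, assuming hypothetical proxies and sites arrays:
require 'watir-webdriver'
results = {}
proxies.each do |proxy_address|                    # one browser instance per proxy...
  profile = Selenium::WebDriver::Firefox::Profile.new
  profile.proxy = Selenium::WebDriver::Proxy.new(:http => proxy_address)
  browser = Watir::Browser.new :firefox, :profile => profile
  sites.each do |site|                             # ...then every site through it
    browser.goto site
    results[[proxy_address, site]] = browser.title # or whatever check you need
  end
  browser.close
end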
