Ruby/Watir: Clicking elements with role="button"

I'm trying to write a script that will automatically click a button when entering a site.
The HTML of the site is as follows:
<span id="zolaDisclaimerButton" class="dijitReset dijitStretch dijitButtonContents" waistate="labelledby-zolaDisclaimerButton_label" wairole="button" dojoattachpoint="titleNode,focusNode" role="button" aria-labelledby="zolaDisclaimerButton_label" tabindex="0" title="I acknowledge this disclaimer... Let ZoLa begin!" style="-moz-user-select: none;">
I have tried the following and they don't work:
browser.span(:id, "zolaDisclaimerButton").click
browser.button(:id, "zolaDisclaimerButton").click
How do I go about clicking those types of button? The URL in question is:
http://gis.nyc.gov/doitt/nycitymap/template?applicationName=ZOLA
EDIT: this is the code I use:
require "watir"
browser = Watir::Browser.new
browser.goto "http://gis.nyc.gov/doitt/nycitymap/template?applicationName=ZOLA"
browser.span(:id, "zolaDisclaimerButton").click
puts "fin"
It navigates to the page, doesn't click anything, then prints 'fin' (to let me know it's done). No exception is thrown.

If the page is not loading in time, you can add waits - see http://watirwebdriver.com/waiting/. These wait methods will wait until either the condition is met or a time limit is exceeded. It is better than using sleep since it only waits as long as needed (rather than waiting 2 seconds for something that loads in 1 second).
In this case, use the when_present method on the span before clicking it:
require "watir"
browser = Watir::Browser.new
browser.goto "http://gis.nyc.gov/doitt/nycitymap/template?applicationName=ZOLA"
browser.span(:id, "zolaDisclaimerButton").when_present.click
puts "fin"
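To make the difference from a fixed sleep concrete, here is a minimal pure-Ruby sketch of the polling idea behind such wait methods. The names wait_until and WaitTimeout are hypothetical and are not part of Watir; the block stands in for a check such as "does the span exist yet?".

```ruby
# Hypothetical error class for this sketch (not a Watir class).
class WaitTimeout < StandardError; end

# Poll the block until it returns a truthy value or the timeout elapses.
# Returns as soon as the condition holds, so it never waits longer than needed.
def wait_until(timeout: 5, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeout, "timed out after #{timeout} seconds"
    end
    sleep interval
  end
end

# Simulated condition that becomes true on the third check.
attempts = 0
value = wait_until(timeout: 2, interval: 0.05) do
  attempts += 1
  attempts >= 3 ? "ready" : nil
end
```

A fixed sleep 2 would always cost two seconds; the loop above returns after the third check, or raises once the time limit is exceeded.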

Click through all next pages in a loop until the last page with the Watir gem

I have a problem in my Ruby Watir script.
I want to click through all the next pages until the last page, printing each first name and last name along the way. I know that on the last page the "next" link carries one more class, "disabled": stop = b.link(class: 'next-pagination page-link disabled').
I try to loop until a link with these classes is found: break if stop.exists?
loop do
  link = b.link(class: 'next-pagination page-link')
  name_array = b.divs(class: 'name-and-badge-container').map { |e| e.div(class:'name-container').link(class: 'name-link profile-link').text.split("\n") }
  puts name_array
  stop = b.link(class: 'next-pagination page-link disabled')
  break if stop.exists?
  link.click
end
I get this error:
This code has slept for the duration of the default timeout waiting for an Element to exist. If the test is still passing, consider using Element#exists? instead of rescuing UnknownObjectException
/Users/vincentcheloudiakoff/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/watir-6.2.1/lib/watir/elements/element.rb:496:in `rescue in wait_for_exists': timed out after 30 seconds, waiting for #<Watir::Div: located: false; {:class=>"name-and-badge-container", :tag_name=>"div", :index=>13}> to be located (Watir::Exception::UnknownObjectException)
from /Users/vincentcheloudiakoff/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/watir-6.2.1/lib/watir/elements/element.rb:486:in `wait_for_exists'
from /Users/vincentcheloudiakoff/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/watir-6.2.1/lib/watir/elements/element.rb:487:in `wait_for_exists'
from /Users/vincentcheloudiakoff/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/watir-6.2.1/lib/watir/elements/element.rb:487:in `wait_for_exists'
from /Users/vincentcheloudiakoff/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/watir-6.2.1/lib/watir/elements/element.rb:639:in `element_call'
from /Users/vincentcheloudiakoff/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/watir-6.2.1/lib/watir/elements/element.rb:91:in `text'
from /Users/vincentcheloudiakoff/Travail/Automation/lib/linkedin.rb:24:in `block (2 levels) in start'
from /Users/vincentcheloudiakoff/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/watir-6.2.1/lib/watir/element_collection.rb:28:in `each'
from /Users/vincentcheloudiakoff/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/watir-6.2.1/lib/watir/element_collection.rb:28:in `each'
from /Users/vincentcheloudiakoff/Travail/Automation/lib/linkedin.rb:24:in `map'
from /Users/vincentcheloudiakoff/Travail/Automation/lib/linkedin.rb:24:in `block in start'
from /Users/vincentcheloudiakoff/Travail/Automation/lib/linkedin.rb:22:in `loop'
from /Users/vincentcheloudiakoff/Travail/Automation/lib/linkedin.rb:22:in `start'
from start.rb:3:in `<main>'
It clicks on the next page, but does not find the next disabled button.
Use the text to locate that element
b.span(text: 'Suivant').click
You don't have to go through the parent link and then the span, as in b.link().span(); you can locate the span directly, the way I have shown.
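For the stop condition itself, one alternative to a second locator is to inspect the class attribute of the "next" link. This is a different technique from the text-based locator above, shown only as a sketch. The helper name last_page? is hypothetical; in Watir the string would presumably come from something like link.class_name, which is stubbed here with plain strings so the example runs without a browser.

```ruby
# Returns true when the whitespace-separated class list contains "disabled".
# Splitting on whitespace avoids depending on the order or number of classes.
def last_page?(class_attribute)
  class_attribute.to_s.split.include?('disabled')
end

# Class strings taken from the question.
middle_page = last_page?('next-pagination page-link')
final_page  = last_page?('next-pagination page-link disabled')
```

Inside the loop, the check would run before link.click, so the script stops on the page whose "next" link is disabled.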

How to select an inline element using watir in ruby

I want to click the text "HEATMAPS" after the web page opens. I have tried a number of click methods, including locating it as a hyperlink, by text, and by XPath. None of them worked. I suspect I am either wrong to treat it as a hyperlink or choosing the wrong XPath.
Link of the web page
Please find the code below:
require 'watir-webdriver'
require 'watir-ng'
WatirNg.patch!
WatirNg.register(:ng_scope).patch!
browser = Watir::Browser.new
browser.goto 'http://app.vwo.com/#/campaign/108/summary? token=eyJhY2NvdW50X2lkIjoxNTA3MzQsImV4cGVyaW1lbnRfaWQiOjEwOCwiY3JlYXRlZF9vbiI6MTQ0NDgxMjQ4MSwidHlwZSI6ImNhbXBhaWduIiwidmVyc2lvbiI6MSwiaGFzaCI6IjJmZjk3OTVjZTgwNmFmZjJiOTI5NDczMTc5YTBlODQxIn0='
lin = browser.link :text=> 'HEATMAPS'
lin.exist?
lin.click
Can someone please show me how to click the link with the text "HEATMAPS" on the page?
The error I get:
`This code has slept for the duration of the default timeout waiting for an Element to exist. If the test is still passing, consider using Element#exists? instead of rescuing UnknownObjectException
C:/Ruby22/lib/ruby/gems/2.2.0/gems/watir-6.1.0/lib/watir/elements/element.rb:507:in `rescue in wait_for_exists': timed out after 30 seconds, waiting for {:text=>"HEATMAPS", :tag_name=>"a"} to be located (Watir::Exception::UnknownObjectException)
from C:/Ruby22/lib/ruby/gems/2.2.0/gems/watir-6.1.0/lib/watir/elements/element.rb:497:in `wait_for_exists'
from C:/Ruby22/lib/ruby/gems/2.2.0/gems/watir-6.1.0/lib/watir/elements/element.rb:515:in `wait_for_present'
from C:/Ruby22/lib/ruby/gems/2.2.0/gems/watir-6.1.0/lib/watir/elements/element.rb:533:in `wait_for_enabled'
from C:/Ruby22/lib/ruby/gems/2.2.0/gems/watir-6.1.0/lib/watir/elements/element.rb:656:in `element_call'
from C:/Ruby22/lib/ruby/gems/2.2.0/gems/watir-6.1.0/lib/watir/elements/element.rb:114:in `click'
from C:/Users/Mrityunjeyan/Documents/GitHub/Simpleprograms/webautomation.rb:10:in `<main>'`
This displays the inner_html text but still doesn't click:
lin = browser.span(:class => 'ng-scope').inner_html
puts lin
The problem is that the link's text is not "HEATMAPS". It is actually "Heatmaps". The text locator is case-sensitive, which means you need:
lin = browser.link :text=> 'Heatmaps'
You can see this if you inspect the HTML:
<a ui-sref="campaign.heatmap-clickmap" href="#/analyze/analysis/108/heatmaps?token=eyJhY2NvdW50X2lkIjoxNTA3MzQsImV4cGVyaW1lbnRfaWQiOjEwOCwiY3JlYXRlZF9vbiI6MTQ0NDgxMjQ4MSwidHlwZSI6ImNhbXBhaWduIiwidmVyc2lvbiI6MSwiaGFzaCI6IjJmZjk3OTVjZTgwNmFmZjJiOTI5NDczMTc5YTBlODQxIn0%3D">
<!-- boIf: isAnalyticsCampaign -->
<span bo-if="isAnalyticsCampaign" class="ng-scope">Heatmaps</span>
<!-- boIf: !isAnalyticsCampaign -->
</a>
It only looks like "HEATMAPS" due to the styling. One of the styles includes a text-transform: uppercase; which visually capitalizes the text. Watir does not interpret the styles, so only knows that the text node is "Heatmaps".
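The case mismatch is easy to reproduce in plain Ruby. The sketch below compares the text node from the HTML with what the styling displays. Watir locators also accept a Regexp for :text, so a case-insensitive pattern such as /heatmaps/i is one way to sidestep the problem; that usage appears only in a comment because it needs a live browser.

```ruby
dom_text      = "Heatmaps"  # text node as it appears in the HTML
rendered_text = "HEATMAPS"  # what text-transform: uppercase displays

# An exact string comparison is case-sensitive, so it fails.
exact_match = (dom_text == rendered_text)

# A case-insensitive pattern matches regardless of how the text is styled.
pattern = /\Aheatmaps\z/i
regexp_match = pattern.match?(dom_text)

# With a browser session, the same pattern could be passed to the locator:
#   lin = browser.link(text: /heatmaps/i)
```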
Once you have identified the link, clicking still has a problem:
lin.click
#=> Selenium::WebDriver::Error::UnknownError:
#=> unknown error: Element <a ui-sref="analyze.heatmaps" ng-class="{selected: (locationContains('analyze', 'heatmap') && !locationContains('analyze', '/analysis') && state.current.name !== 'campaign.heatmap-clickmap') || (isAnalyzeHeatmapEnabled && (isAnalyzeDeprecatedHeatmapView || locationContains('analyze', '/analysis', 'heatmaps')) )}" data-qa="nav-main-analyze-heatmap" href="#/analyze/heatmap">...</a>
#=> is not clickable at point (90, 121).
#=> Other element would receive the click: <div ng-show="!isCROSetupView" class="">...</div>
The link being located is actually the one in the left menu, which is disabled, rather than the top menu. You need to scope the link locator to just the top menu. Using the parent ul element appears to be sufficient:
lin = browser.ul(class: 'page-nav').link(text: 'Heatmaps')
lin.click
To see the click complete, you might want to tell Chrome not to close at the end of the script. This is done by opening the browser using the following option:
caps = Selenium::WebDriver::Remote::Capabilities.chrome("chromeOptions" => {'detach' => true})
browser = Watir::Browser.new :chrome, desired_capabilities: caps
Your target link doesn't have any text. I confirmed with nokogiri that the text of the <a> tag includes the text inside the child <span> tag, and it turns out Watir works the same way.
If you use a Chrome browser, you can select:
View > Developer > Developer tools
Then you can select an element on the page, and the corresponding section in the html will be highlighted. Here is what the html looks like:
<a ui-sref="campaign.heatmap-clickmap"
href="#/analyze/analysis/108/heatmaps?token=eyJhY2NvdW50X2lkIjoxNTA3MzQsImV4cGVyaW1lbnRfaWQiOjEwOCwiY3JlYXRlZF9vbiI6MTQ0NDgxMjQ4MSwidHlwZSI6ImNhbXBhaWduIiwidmVyc2lvbiI6MSwiaGFzaCI6IjJmZjk3OTVjZTgwNmFmZjJiOTI5NDczMTc5YTBlODQxIn0%3D">
<!-- boIf: isAnalyticsCampaign -->
<span bo-if="isAnalyticsCampaign" class="ng-scope">Heatmaps</span>
<!-- boIf: !isAnalyticsCampaign --> </a>
The basic structure is:
<a><span>Heatmaps</span></a>
Chrome even has a feature where you can right click on an element and get its xpath, which you can use with Watir.
However, when I execute my program to go to that page, I see Chrome launch, and then I'm presented with a login page rather than the page that was presented to me at your link.
Arghhh...what a colossal waste of time. I thought Watir must be broken. With the proper url, I can get the link using several different techniques:
1)
require 'watir'
require 'watir-ng' #<=NOTE THIS
WatirNg.register(:'bo_if').patch! #<= NOTE THIS
br = Watir::Browser.new :chrome
br.goto 'http://app.vwo.com/#/analyze/analysis/108/summary?token=eyJhY2NvdW50X2lkIjoxNTA3MzQsImV4cGVyaW1lbnRfaWQiOjEwOCwiY3JlYXRlZF9vbiI6MTQ0NDgxMjQ4MSwidHlwZSI6ImNhbXBhaWduIiwidmVyc2lvbiI6MSwiaGFzaCI6IjJmZjk3OTVjZTgwNmFmZjJiOTI5NDczMTc5YTBlODQxIn0%3D'
target_link = br.span(
  class: 'ng-scope',
  'bo_if': 'isAnalyticsCampaign',
).parent #<= NOTE THIS
puts target_link.text #HEATMAPS
puts target_link.href #http://app.vwo.com/#/analyze/analysis/1.....
2)
require 'watir'
require 'watir-ng' #<=NOTE THIS
WatirNg.register(:ui_sref).patch! #<=NOTE THIS
br = Watir::Browser.new :chrome
br.goto 'http://app.vwo.com/#/analyze/analysis/108/summary?token=eyJhY2NvdW50X2lkIjoxNTA3MzQsImV4cGVyaW1lbnRfaWQiOjEwOCwiY3JlYXRlZF9vbiI6MTQ0NDgxMjQ4MSwidHlwZSI6ImNhbXBhaWduIiwidmVyc2lvbiI6MSwiaGFzaCI6IjJmZjk3OTVjZTgwNmFmZjJiOTI5NDczMTc5YTBlODQxIn0%3D'
target_link = br.link(
  ui_sref: 'campaign.heatmap-clickmap',
  href: '#/analyze/analysis/108/heatmaps?token=eyJhY2NvdW50X2lkIjoxNTA3MzQsImV4cGVyaW1lbnRfaWQiOjEwOCwiY3JlYXRlZF9vbiI6MTQ0NDgxMjQ4MSwidHlwZSI6ImNhbXBhaWduIiwidmVyc2lvbiI6MSwiaGFzaCI6IjJmZjk3OTVjZTgwNmFmZjJiOTI5NDczMTc5YTBlODQxIn0%3D'
)
puts target_link.text #HEATMAPS
puts target_link.href #http://app.vwo.com/#/analyze/analysis/108...
3) Getting the xpath by right clicking on the link in Chrome:
require 'watir'
br = Watir::Browser.new :chrome
br.goto 'http://app.vwo.com/#/analyze/analysis/108/summary?token=eyJhY2NvdW50X2lkIjoxNTA3MzQsImV4cGVyaW1lbnRfaWQiOjEwOCwiY3JlYXRlZF9vbiI6MTQ0NDgxMjQ4MSwidHlwZSI6ImNhbXBhaWduIiwidmVyc2lvbiI6MSwiaGFzaCI6IjJmZjk3OTVjZTgwNmFmZjJiOTI5NDczMTc5YTBlODQxIn0%3D'
target_link = br.link(
  xpath: '//*[@id="main-container"]/ul/li[3]/a'
)
puts target_link.text #HEATMAPS
puts target_link.href #http://app.vwo.com/#/analyze/analysis/108....
xpath can handle any tag or attribute name, unlike Watir, so it seems like a good tool in that regard.
4) A less brittle xpath:
require 'watir'
br = Watir::Browser.new :chrome
br.goto 'http://app.vwo.com/#/analyze/analysis/108/summary?token=eyJhY2NvdW50X2lkIjoxNTA3MzQsImV4cGVyaW1lbnRfaWQiOjEwOCwiY3JlYXRlZF9vbiI6MTQ0NDgxMjQ4MSwidHlwZSI6ImNhbXBhaWduIiwidmVyc2lvbiI6MSwiaGFzaCI6IjJmZjk3OTVjZTgwNmFmZjJiOTI5NDczMTc5YTBlODQxIn0%3D'
a_href = "#/analyze/analysis/108/heatmaps?token=eyJhY2NvdW50X2lkIjoxNTA3MzQsImV4cGVyaW1lbnRfaWQiOjEwOCwiY3JlYXRlZF9vbiI6MTQ0NDgxMjQ4MSwidHlwZSI6ImNhbXBhaWduIiwidmVyc2lvbiI6MSwiaGFzaCI6IjJmZjk3OTVjZTgwNmFmZjJiOTI5NDczMTc5YTBlODQxIn0%3D"
target_link = br.link(
  xpath: %Q{ //a[@href="#{a_href}"] } +
    '[@ui-sref="campaign.heatmap-clickmap"]' +
    '[child::span[@class="ng-scope"][@bo-if="isAnalyticsCampaign"][text()="Heatmaps"]]'
)
The xpath looks for an <a> tag with two attributes:
//a[@href="blah"][@ui-sref="bleh"]
which has a child <span> (i.e. a direct child) with three attributes:
[child::span[@class="blih"][@bo-if="bloh"][text()="Heatmaps"]]
After I programmatically click the link:
target_link.click
Watir goes to the next page.
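The attribute-predicate part of that xpath can be tried without a browser. The sketch below runs a shortened version of it against a trimmed stand-in for the page, using Ruby's REXML library rather than Watir. The markup is an assumption made for illustration: the side-nav list and the shortened href (token removed) are hypothetical, and REXML supports only part of XPath, so the nested child::span predicate is replaced by a separate check.

```ruby
require 'rexml/document'

# Trimmed, hypothetical stand-in for the page: two links containing "Heatmaps",
# one in a side menu and one in the top menu.
html = <<~HTML
  <div>
    <ul class="side-nav">
      <li><a ui-sref="analyze.heatmaps" href="#/analyze/heatmap"><span>Heatmaps</span></a></li>
    </ul>
    <ul class="page-nav">
      <li><a ui-sref="campaign.heatmap-clickmap" href="#/analyze/analysis/108/heatmaps"><span class="ng-scope">Heatmaps</span></a></li>
    </ul>
  </div>
HTML

doc = REXML::Document.new(html)

# Two chained attribute predicates, as in //a[@href="blah"][@ui-sref="bleh"].
xpath = '//a[@ui-sref="campaign.heatmap-clickmap"][@href="#/analyze/analysis/108/heatmaps"]'

link      = REXML::XPath.first(doc, xpath)
matches   = REXML::XPath.match(doc, xpath).size
span_text = link.elements['span'].text  # text of the direct child <span>
```

Because both predicates must hold, the side-menu link with the same visible text is not selected.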

how to test / interact with AJAX autocomplete and Capybara / Poltergeist

I am trying to interact with an external website at: http://is.gd/LtgYEk
I need to be able to fill in the input with id="textOrigen" here is the html
<p>
<label class="form-label">Departing from:</label>
<span class="label-field">
<input type="text" autocomplete="off" onblur="onblur2('textOrigen');" onfocus="initID('textOrigen');" size="17" id="textOrigen" name="text" maxlength="40" style="position:static;color: #505050;">
<select style="display:none" onchange="clearValidate(); Origen();" class="validate[dynamic]" id="cmbOrigen" name="cmbOrigen">
<option value="-1" selected="selected">Origin</option>
</select>
<label class="label-error" id="lblerrorOrigen"></label>
</span>
</p>
I put together a simple Ruby script using 'capybara/poltergeist'.
I am unable to replicate the browser behavior, which is:
when the input field is clicked, its default value is highlighted, so it is replaced as soon as you start typing.
I have lost track of all the variations I tried, but there were many. I found another SO post that seemed somewhat useful, but it didn't help.
This is the last revision of the method to fill this field:
def session.fill_autocomplete(field, options = {})
  execute_script %Q{ $('##{field}').trigger('focus') }
  fill_in field, with: options[:with]
  execute_script %Q{ $('##{field}').trigger('focus') }
  execute_script %Q{ $('##{field}').trigger('keydown') }
  selector = %Q{#output div:contains('#{options[:with]}')}
  execute_script "$(\"#{selector}\").mouseenter().click()"
end
As I wrote the script is very simple, the only other relevant bit is when the session is instantiated with:
session = Capybara::Session.new(:poltergeist)
Any help would be greatly appreciated.
I noticed that using the right version of phantomjs is fundamental.
Although 2.x is out, I noticed that phantomjs 1.8.2 behaves much more as expected and is way less buggy.
I'm currently testing autocompleted fields in RailsAdmin with success without using any delay technique.
def fill_in_autocomplete(selector, text)
  find(selector).native.send_keys(*text.chars)
end
def choose_autocomplete_entry(text)
  find('ul.ui-autocomplete').should have_content(text)
  page.execute_script("$('.ui-menu-item:contains(\"#{text}\")').find('a').trigger('mouseenter').click()")
end
An example selector for fill_in_autocomplete would be:
".author_field .ui-autocomplete-input"
I found the solution after testing in many ways.
The key was to add some delay to allow the auto suggest div to be populated.
Here is the method that worked:
def session.fill_city(field, options = {})
  sleep 3
  script = %Q{ $("#{field}").focus().keypress().val("#{options[:with]}") }
  execute_script(script)
  sleep 2
  find('#output').find('div').trigger('click')
end

Selenium-Webdriver Ruby --> How to wait for images to be fully loaded after click

I am very new to Ruby and Selenium-Webdriver, so please, help :)
I am trying to open an email campaign sent to my inbox that contains images and take a screenshot in Firefox, but I cannot make the script wait until the images are fully loaded. Once I click 'Show images', the screenshot is taken immediately, before the images have loaded. How can I pause the script and take the screenshot later, after all images are displayed?
Please, help :(
Below is my script:
require 'selenium-webdriver'
browser = Selenium::WebDriver.for :firefox
#==========================================================================================
wait = browser.manage.timeouts.implicit_wait = 15
#==========================================================================================
url = 'https://login.yahoo.com/config/login_verify2?.intl=us&.src=ym'
# Open browser (firefox)
browser.navigate.to url
browser.find_element(:id, 'username').send_keys "some yahoo id"
browser.find_element(:id, 'passwd').send_key "some password"
browser.find_element(:id, ".save").click
browser.find_element(:id, "inbox-label").click
browser.find_element(:xpath, "//div[@class='subj']").click
browser.find_element(:xpath, "//a[@title='Display blocked images']").click
result_page_title = browser.find_element(:tag_name, 'title')
puts "Title of the page: \t\t: #{result_page_title.text}"
browser.save_screenshot "1.jpg"
You can use an implicit wait or an explicit wait to wait until a particular web element appears on the page. You define the wait period yourself, and the right value depends on the application.
Explicit Wait:
An explicit wait is code you define to wait for a certain condition to occur before proceeding further in the code. Once the condition is met, the wait ends and execution continues with the next steps.
Code:
WebDriverWait wait = new WebDriverWait(driver,30);
wait.until(ExpectedConditions.visibilityOfElementLocated(By.id(strEdit)));
Or
WebElement myDynamicElement = (new WebDriverWait(driver, 30))
    .until(new ExpectedCondition<WebElement>() {
        @Override
        public WebElement apply(WebDriver d) {
            return d.findElement(By.id("myDynamicElement"));
        }
    });
This waits up to 30 seconds before throwing a TimeoutException; if it finds the element sooner, it returns it as soon as it is found. WebDriverWait by default calls the ExpectedCondition every 500 milliseconds until it returns successfully. A successful return is true for Boolean ExpectedCondition types and a non-null value for all other ExpectedCondition types.
You can use ExpectedConditions class as you need for the application.
Implicit Wait:
An implicit wait is to tell WebDriver to poll the DOM for a certain amount of time when trying to find an element or elements if they are not immediately available
Code:
driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
One thing to keep in mind is that once the implicit wait is set - it will remain for the life of the WebDriver object instance
For more info use this link http://seleniumhq.org/docs/04_webdriver_advanced.jsp
The above code is in Java. Adapt it to your language as needed.
Ruby code from the docs (click on the 'ruby' button):
wait = Selenium::WebDriver::Wait.new(:timeout => 10) # seconds
begin
  element = wait.until { driver.find_element(:id => "some-dynamic-element") }
ensure
  driver.quit
end
Which works for me
To add to the above answer, here is how I use implicit and explicit wait in Ruby.
Implicit Wait
I pass this option to Selenium::WebDriver after initializing with a couple of lines like this:
browser = Selenium::WebDriver.for :firefox
browser.manage.timeouts.implicit_wait = 10
Just replace "10" with the number of seconds you'd like WebDriver to keep retrying element lookups before giving up.
Explicit Wait
There are two steps to declaring an explicit wait in Selenium. First you set the timeout period by creating a wait object, and then you invoke the wait with that object's until method (Selenium::WebDriver::Wait#until). It would look something like this, in your example:
wait = Selenium::WebDriver::Wait.new(:timeout => 10)
wait.until { browser.find_element(:xpath, "//path/to/picture").displayed? }
This would tell the Webdriver to wait a maximum of 10 seconds for the picture element to be displayed. You can also use .enabled? if the element you're waiting for is an interactive element - this is especially useful when you're working with Ajax-based input forms.
You can also declare an explicit wait period at the start of your script, and then reference the object again whenever you need it. There's no need to redeclare it unless you want to set a new timeout. Personally, I like to keep the wait.until wrapped in a method, because I know I'm going to reference it repeatedly. Something like:
def wait_for_element_present(how_long = 5, how, what)
  wait_for_it = Selenium::WebDriver::Wait.new(:timeout => how_long)
  wait_for_it.until { @browser.find_element(how, what) }
end
(I find it's easier to just declare browser as an instance variable so that you don't have to pass it to the method each time, but that part's up to you, I guess?)
ExpectedConditions isn't supported yet in the Ruby Selenium bindings. This snippet below does the same thing as ExpectedConditions.elementToBeClickable — clickable just means "visible" and "enabled".
element = wait_for_clickable_element(:xpath => xpath)
def wait_for_clickable_element(locator)
  wait = Selenium::WebDriver::Wait.new(:timeout => 10)
  element = wait.until { @driver.find_element(locator) }
  wait.until { element.displayed? }
  wait.until { element.enabled? }
  return element
end
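None of the waits above checks the images themselves, which is what the original question was about. One possible approach, offered as a sketch rather than something taken from the Selenium docs, is to ask the page through execute_script whether every img reports complete with a non-zero naturalWidth, and to poll until it does. The names wait_for_images, ImagesNotLoaded and FakeDriver are hypothetical; FakeDriver only stands in for a real driver so the example runs without a browser.

```ruby
# JavaScript evaluated in the page: true once every <img> has finished loading.
IMAGES_LOADED_JS = <<~JS
  return Array.prototype.every.call(document.images, function (img) {
    return img.complete && img.naturalWidth > 0;
  });
JS

# Hypothetical error class for this sketch.
class ImagesNotLoaded < StandardError; end

# Polls any object that responds to execute_script (such as a Selenium driver).
def wait_for_images(driver, timeout: 10, interval: 0.25)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until driver.execute_script(IMAGES_LOADED_JS)
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise ImagesNotLoaded, "images still loading after #{timeout} seconds"
    end
    sleep interval
  end
  true
end

# Stand-in driver: reports "loaded" on the third check.
class FakeDriver
  attr_reader :calls

  def initialize(ready_after)
    @ready_after = ready_after
    @calls = 0
  end

  def execute_script(_js)
    @calls += 1
    @calls >= @ready_after
  end
end

driver = FakeDriver.new(3)
loaded = wait_for_images(driver, timeout: 2, interval: 0.01)
```

With a real session, the call would sit between the 'Display blocked images' click and save_screenshot. A broken image never reports a non-zero naturalWidth, so in that case the helper raises after the timeout instead of waiting forever.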

Ruby webscrape script for GoDaddy

I'm new to Ruby and for my first scripting assignment, I've been asked to write a web scraping script to grab elements of our DNS listings from GoDaddy.
I'm having issues scraping the links, which I then need to follow. I need to get the link from the "GoToSecondaryDNS" js element below. I'm using Mechanize and Nokogiri:
<td class="listCellBorder" align="left" style="width:170px;">
<div style="padding-left:4px;">
<div id="gvZones21divDynamicDNS"></div>
<div id="gvZones21divMasterSlave" cicode="41022" onclick="GoToSecondaryDNS('iwanttoscrapethislink.com',0)" class="listFeatureButton secondaryDNSNoPremium" onmouseover="ShowSecondaryDNSAd(this, event);" onmouseout="HideAdInList(event);"></div>
<div id="gvZones21divDNSSec" cicode="41023" class="listFeatureButton DNSSECButtonNoPremium" onmouseover="ShowDNSSecAd(this, event);" onmouseout="HideAdInList(event);" onclick="UpgradeLinkActionByID('gvZones21divDNSSec'); return false;" useClick="true" clickObj="aDNSSecUpgradeClicker"></div>
<div id="gvZones21divVanityNS" onclick="GoToVanityNS('iwanttoscrapethislink.com',0)" class="listFeatureButton vanityNameserversNoPremium" onmouseover="ShowVanityNSAd(this, event);" onmouseout="HideAdInList(event);"></div>
<div style="clear:both;"></div>
</div>
</td>
How can I scrape the link 'iwanttoscrapethislink.com' and then interact with the onclick to follow the link and scrape content on the following page with Ruby?
So far, I have a simple start to the code:
require 'rubygems'
require 'mechanize'
require 'open-uri'
def get_godaddy_data(url)
web_agent = Mechanize.new
result = nil
### login to GoDaddy admin
page = web_agent.get('https://dns.godaddy.com/Default.aspx?sa=')
## there is only one form and it is the first form on the page
form = page.forms.first
form.username = 'blank'
form.password = 'blank'
## form.submit
web_agent.submit(form, form.buttons.first)
site_name = page.css('div.gvZones21divMasterSlave onclick td')
### export dns zone data
page = web_agent.get('https://dns.godaddy.com/ZoneFile.aspx?zone=' + site_name + '&zoneType=0&refer=dcc')
form = page.forms[3]
web_agent.submit(form, form.buttons.first).save(uri.host + 'scrape.txt')
## end
end
### read export file
##return File.open(uri.host + 'scrape.txt', 'rb') { |file| file.read }
end
def scrape_dns(url)
site_name = page.css('div.gvZones21divMasterSlave onclick td')
LIST_URL = "https://dns.godaddy.com/ZoneFile.aspx?zone=" + site_name + '&zoneType=0&refer=dcc"
page = Nokogiri::HTML(open(LIST_URL))
#not sure how to scrape onclick urls and then how to click through to continue scraping on the second page for each individual DNS
end
You can't interact with "onclick" because Nokogiri isn't a JavaScript engine.
You can extract the contents and then use that as the URL for a subsequent web request. Assuming doc contains the parsed HTML:
doc.at('div[onclick^="GoToSecondaryDNS"]')['onclick']
will give you the value of the onclick attribute. ^= means "the attribute value starts with", so it lets us rule out other <div> tags whose onclick attributes call different functions, and returns:
"GoToSecondaryDNS('iwanttoscrapethislink.com',0)"
Using a simple regex [/'(.+)'/,1] will get you the hostname:
doc.at('div[onclick^="GoToSecondaryDNS"]')['onclick'][/'(.+)'/,1]
=> "iwanttoscrapethislink.com"
The rest, such as how to get access to Mechanize's internal Nokogiri document, and how to create the new URL, are left for you to figure out.
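As a starting point for the URL step, here is a small sketch that works on the onclick string alone, so it runs without Nokogiri or Mechanize. It assumes the zone-file URL pattern from the question's own code (ZoneFile.aspx with zone, zoneType and refer parameters) is correct, and it uses [^']+ instead of .+ so the match stops at the first closing quote.

```ruby
require 'uri'

# The attribute value as returned by the Nokogiri lookup above.
onclick = "GoToSecondaryDNS('iwanttoscrapethislink.com',0)"

# Capture the text between the first pair of single quotes.
zone = onclick[/'([^']+)'/, 1]

# Build the query string with proper escaping; the parameter names are taken
# from the URL in the question and are assumed to be what the site expects.
query = URI.encode_www_form(zone: zone, zoneType: 0, refer: 'dcc')
zone_url = "https://dns.godaddy.com/ZoneFile.aspx?#{query}"
```

The resulting zone_url can then be passed to the agent's get method in place of the string concatenation in the question.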
