How to confirm a JavaScript popup using Nokogiri or Mechanize - ruby

I'm running a script against my localhost, which serves a vulnerable web app test suite.
I'm trying to confirm an XSS popup from a JavaScript alert. For example:
http://127.0.0.1:65412/v?=0.2<script>alert("TEST");</script>
I need to confirm the popup happened using either Mechanize or Nokogiri. Is it possible to confirm that the popup is there with Nokogiri or Mechanize?
For example:
require 'nokogiri'
require 'rest-client'

def page(site)
  Nokogiri::HTML(RestClient.get(site))
end
puts page('http://127.0.0.1:65412/v?=0.2<script>alert("TEST");</script>')

Nokogiri (and Mechanize, because it is built on top of Nokogiri) can parse the HTML and return the <script> tag's contents. The tag's content is text, so at that point it's necessary to look inside the text to find what you want:
require 'nokogiri'
doc = Nokogiri::HTML(<<EOT)
<html>
<head>
<script>alert("TEST");</script>
</head>
</html>
EOT
script_content = doc.at('script').content # => "alert(\"TEST\");"
It's easy to check to see if a sub-string exists at that point:
script_content['alert("TEST");'] # => "alert(\"TEST\");"
or:
!!script_content['alert("TEST");'] # => true
Note: It's not possible with Nokogiri or Mechanize to tell whether a pop-up occurred, as that happens inside a browser as it runs the JavaScript. Neither Nokogiri nor Mechanize understands or interprets JavaScript; only a tool that drives a real browser and interprets JavaScript, such as Watir or Selenium, could do that.
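For illustration, a minimal Watir sketch of that approach might look like this (Watir drives a real browser, so the JavaScript actually runs; whether the alert fires depends on the test app echoing the payload unescaped):
require 'watir'

browser = Watir::Browser.new :firefox
browser.goto 'http://127.0.0.1:65412/v?=0.2<script>alert("TEST");</script>'

if browser.alert.exists?
  puts "Alert fired with text: #{browser.alert.text}"
  browser.alert.ok   # dismiss it so the browser can keep working
else
  puts 'No alert appeared'
end

browser.close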

Definitely not, because neither Mechanize nor Nokogiri runs JavaScript.
Instead, you could use Selenium.
Something like this:
require 'selenium-webdriver'

class AlertChecker
  Driver = Selenium::WebDriver.for :firefox

  def initialize(url)
    Driver.navigate.to url
  end

  def raise_alert(text)
    Driver.execute_script "alert('#{text}')"
    self
  end

  def safely_get_alert
    begin
      Driver.switch_to.alert
    rescue Selenium::WebDriver::Error::NoAlertOpenError
    end
  end
end
Usage:
alert_checker = AlertChecker.new("http://my.website")
alert = alert_checker.safely_get_alert
# => nil
alert_checker.raise_alert("hack")
alert = alert_checker.safely_get_alert
puts alert.text
# => 'hack'
# As far as I'm aware Selenium doesn't have a built-in way
# to tell you if it's an alert, confirm, or prompt.
# But if you know it's a prompt, for example, you could also send
# keys before accepting or dismissing it.
alert.accept
alert = alert_checker.safely_get_alert
# => nil
There are some tricky things with Selenium's handling of alerts, though.
There's no way for your code to detect the type (prompt, confirm, or alert) without using something like rescue or try. Everything is reached through switch_to.alert.
Also, if your browser has an alert open you cannot run any subsequent commands unless you handle the alert first. Say you try navigate.to while the alert is open; you'd get an error along the lines of "You didn't handle the alert" and your navigate.to command would have to be rerun. When this error is raised, the alert object is lost as well.
It's a little unappealing to use rescue as a control structure in this way, but I'm not aware of any other option.
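One way to cope with the second problem is to rescue the unhandled-alert error, deal with the alert, and retry the command. A hedged sketch (error class names follow the same, older selenium-webdriver naming used in the code above):
def navigate_handling_alerts(driver, url)
  driver.navigate.to url
rescue Selenium::WebDriver::Error::UnhandledAlertError
  # An alert was open when we tried to navigate; handle it, then retry once.
  begin
    driver.switch_to.alert.accept
  rescue Selenium::WebDriver::Error::NoAlertOpenError
    # The driver may already have dismissed it when raising the error.
  end
  driver.navigate.to url
end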

Related

Browser.back is not working

Using Watir, I've written a script to check that multiple links lead to the right page, as below.
Links = ["Link", "Link1"]
Links.each do |LinkValue|
  @browser.link(:text => LinkValue).wait_until_present.click
  fail unless @browser.text.include?(LinkValue)
  @browser.back
end
What I am trying is:
maintaining Linktext in an array
iterating with each linktext
verify
navigate to the previous page to start verifying with next linktext.
But the script is not working: it does not execute past the first value and does not navigate back.
The following script works for me:
require 'watir'
browser = Watir::Browser.new(:firefox) # :chrome also works
browser.goto 'https://www.google.com/'
browser.link(text: 'Gmail').wait_until_present.click
sleep(10)
browser.back
sleep(10)
You are calling Kernel#fail, which will raise an exception if the condition isn't satisfied.
In this case, it looks like you are expecting that the destination page will contain the same link text that was clicked on the originating page. If that's not true, then the script will raise an exception and terminate.
Here's a contrived "working" example (which only "works" because the link text exists on both originating and destination pages):
require 'watir'
b = Watir::Browser.new :chrome
b.goto "http://www.iana.org/domains/reserved"
links = ["Overview", "Root Zone Management"]
links.each do |link|
  b.link(:text => link).click
  fail unless b.text.include? link
  b.back
end
b.close
Some observations:
I wouldn't use fail here. You should investigate a testing framework like Minitest or RSpec, which have assertion methods for validating application behavior (a minimal Minitest sketch follows these observations).
In ruby, variables (and methods and symbols) should be in snake_case.
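For example, a minimal Minitest version of the contrived example above might look like this (class and test names are made up for illustration):
require 'minitest/autorun'
require 'watir'

class LinkNavigationTest < Minitest::Test
  LINKS = ['Overview', 'Root Zone Management'].freeze

  def test_each_link_leads_to_a_page_containing_its_text
    browser = Watir::Browser.new :chrome
    browser.goto 'http://www.iana.org/domains/reserved'

    LINKS.each do |link_text|
      browser.link(text: link_text).click
      assert_includes browser.text, link_text   # replaces `fail unless ...`
      browser.back
    end
  ensure
    browser.close if browser
  end
end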

How to scrape pages which have lazy loading

Here is the code I used for parsing a web page. I ran it in the Rails console, but I am not getting any output. The site I want to scrape uses lazy loading.
require 'nokogiri'
require 'open-uri'
page = 1
while true
  url = "http://www.justdial.com/functions"+"/ajxsearch.php?national_search=0&act=pagination&city=Delhi+%2F+NCR&search=Pandits"+"&where=Delhi+Cantt&catid=1195&psearch=&prid=&page=#{page}"
  doc = Nokogiri::HTML(open(url))
  doc = Nokogiri::HTML(doc.at_css('#ajax').text)
  d = doc.css(".rslwrp")
  d.each do |t|
    puts t.css(".jrcw").text
    puts t.css("span.jcn").text
    puts t.css(".jaid").text
    puts t.css(".estd").text
    page+=1
  end
end
You have two options here:
Switch from pure HTTP scraping to a tool that supports JavaScript evaluation, such as Capybara (with a proper driver selected). This can be slow, since you're running a headless browser under the hood, plus you'll have to set some timeouts or figure out another way to make sure the blocks of text you're interested in are loaded before you start any scraping (see the rough sketch after this list).
The second option is to use the Web Developer console to figure out how those blocks of text are loaded (which AJAX calls are made, with which parameters, etc.) and implement those calls in your scraper. This is a more advanced approach, but more performant, since you avoid the extra work described in option 1.
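A rough sketch of option 1 with Capybara (the driver, wait time, and URL are assumptions; the .rslwrp selector is carried over from your snippet):
require 'capybara'
require 'capybara/dsl'

Capybara.run_server = false           # we're visiting an external site
Capybara.default_driver = :selenium   # a JavaScript-capable driver
Capybara.default_max_wait_time = 15   # give the lazy loader time to finish

include Capybara::DSL

visit 'http://www.justdial.com/Delhi-NCR/Pandits'        # placeholder listing URL
page.assert_selector('.rslwrp', minimum: 1)               # blocks until results appear
page.all('.rslwrp').each { |result| puts result.text }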
Have a nice day!
UPDATE:
Your code above doesn't work because the response is HTML code wrapped in a JSON object, while you're trying to parse it as raw HTML. It looks like this:
{
  "error": 0,
  "msg": "request successful",
  "paidDocIds": "some ids here",
  "itemStartIndex": 20,
  "lastPageNum": 50,
  "markup": "LOTS AND LOTS AND LOTS OF MARKUP"
}
What you need to do is unwrap the JSON and then parse the markup as HTML:
require 'json'
json = JSON.parse(open(url).read) # make sure you check http errors here
html = json['markup'] # can this field be empty? check for the json['error'] field
doc = Nokogiri::HTML(html) # parse as you like
I'd also advise against using open-uri, since your code may become vulnerable if you use dynamic URLs because of the way open-uri works (read the linked article for the details); prefer better, more full-featured libraries such as HTTParty and RestClient.
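For instance, the same fetch with RestClient instead of open-uri might look like this (a sketch with only minimal error handling):
require 'rest-client'
require 'json'
require 'nokogiri'

begin
  response = RestClient.get(url)   # url built the same way as in the question
  json = JSON.parse(response.body)
  if json['error'].to_i.zero? && json['markup']
    doc = Nokogiri::HTML(json['markup'])
    # scrape doc here, as before
  end
rescue RestClient::ExceptionWithResponse => e
  warn "Request failed with status #{e.response.code}"
end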
UPDATE 2: Minimal working script for me:
require 'json'
require 'open-uri'
require 'nokogiri'
url = 'http://www.justdial.com/functions/ajxsearch.php?national_search=0&act=pagination&city=Delhi+%2F+NCR&search=Pandits&where=Delhi+Cantt&catid=1195&psearch=&prid=&page=2'
json = JSON.parse(open(url).read) # make sure you check http errors here
html = json['markup'] # can this field be empty? check for the json['error'] field
doc = Nokogiri::HTML(html) # parse as you like
puts doc.at_css('#newphoto10').attr('title')
# => Dr Raaj Batra Lal Kitab Expert in East Patel Nagar, Delhi

Use with Excel data to display on Dashing Dashboard?

I'm trying to get an example of the following code from GitHub working; it looks to be a dead topic for my Linux/Ubuntu install. I have been trying to scrape data from my company intranet using Mechanize (see my other Stack Overflow question for details). Since I'm not smart enough to figure out a way around my login issue, I thought I would try feeding data from an Excel sheet as a workaround until I can figure out the Mechanize route. Once again, I'm not smart enough to get the provided code to work on Linux, because I'm getting the following error:
`kqueue=': kqueue is not supported on this platform (EventMachine::Unsupported)
If I'm understanding the information provided in the original source correctly, the problem is that kqueue isn't supported on Linux. The OP states that inotify is an alternative, but I've had no luck finding a similar example that uses it to display Excel data in a widget.
Here is the code shown on GitHub that I would like help converting to work on Linux:
require 'roo'

EM.kqueue = EM.kqueue?

file_path = "#{Dir.pwd}/spreadsheet.xls"

def fetch_spreadsheet_data(path)
  s = Roo::Excel.new(path)
  send_event('valuation', { current: s.cell(1, 2) })
end

module Handler
  def file_modified
    fetch_spreadsheet_data(path)
  end
end

fetch_spreadsheet_data(file_path)

EM.next_tick do
  EM.watch_file(file_path, Handler)
end
Okay, so I was able to get this working and to display my data on a Dashing Dashboard widget by doing the following:
First: I uploaded my spreadsheet.xls to the root directory of my dashboard.
Second: I replaced the /jobs/sample.rb code with:
#!/usr/bin/env ruby
require 'roo'

SCHEDULER.every '2s' do
  file_path = "#{Dir.pwd}/spreadsheet.xls"

  def fetch_spreadsheet_data(path)
    s = Roo::Excel.new(path)
    send_event('valuation', { current: s.cell('B', 49) })
  end

  module Handler
    def file_modified
      fetch_spreadsheet_data(path)
    end
  end

  fetch_spreadsheet_data(file_path)
end
Third: Make sure the /widgets/number widget is in your dashboard (this is part of the sample install).
Fourth: Add the following code to your /dashboards/sample.erb file (this is also part of the sample install).
<li data-row="1" data-col="1" data-sizex="1" data-sizey="1">
  <div data-id="valuation" data-view="Number" data-title="Current Valuation" data-prefix="$"></div>
</li>
I used this source to help me better understand how Roo works. I tested my widget by changing my values and re-uploading the spreadsheet.xls to the server, and saw instant changes on my dashboard.
Hope this helps someone. I'm still looking for help automating this process by scraping the data; reference this if you can help.
Thanks for sharing this code sample. I did not manage to make it work in my environment (Raspberry/Raspbian), but after some effort I managed to come up with something that works -- at least for me ;)
I had never worked with Ruby before this week, so this code may be a bit rough. Please accept my apologies.
-- Christophe
require 'roo'
require 'rubygems'
require 'rb-inotify'

# Implement INotify::Notifier.watch as described here:
# https://www.go4expert.com/articles/track-file-changes-ruby-inotify-t30264/

file_path = "#{Dir.pwd}/datasheet.csv"

def fetch_spreadsheet_data(path)
  s = Roo::CSV.new(path)
  send_event('csvdata', { value: s.cell(1, 1) })
end

SCHEDULER.every '5s' do
  notifier = INotify::Notifier.new

  notifier.watch(file_path, :modify) do |event|
    event.flags.each do |flag|
      ## convert to string
      flag = flag.to_s
      puts case flag
           when 'modify' then fetch_spreadsheet_data(file_path)
           end
    end
  end

  ## loop, wait for events from inotify
  notifier.process
end

I'm using the Selenium Webdriver gem to try to click on the Facebook chat bar; sometimes it works and sometimes it doesn't

I'm using the Selenium Webdriver gem to try to click on the Facebook chat bar; sometimes it works and sometimes it doesn't. When it doesn't work it returns the Selenium "element not visible" error, but the element clearly is visible. I'm not sure what's wrong with my code.
require 'selenium-webdriver'
driver = Selenium::WebDriver.for :chrome # instantiates a google chrome session
driver.navigate.to 'https://www.facebook.com/' # takes you to facebook.com
emailBar = driver.find_element(:id,"email") #finds email input bar
passwordBar = driver.find_element(:id,"pass") #find password input bar
loginButton = driver.find_element(:id,"u_0_n") #finds login button
emailBar.send_keys "austinspreadsheet@gmail.com" # puts in the email
passwordBar.send_keys "YOURPASSWORD" # puts in the password
loginButton.click # clicks the login button
#THIS IS THE CODE BLOCK THAT DOES NOT WORK
links = driver.find_elements(:class,"fbNubButton") # finds the chat bar
#driver.manage.timeouts.page_load = 10
links[0].click # opens the chat bar
links[1].click # NOTE: sometimes it clicks and sometimes it doesn't, but if you click both chat box classes it usually works, so the error is OK
I have tried not clicking both chat links and it works less when I do that.
I am using Selenium with Python. In cases like yours, the issue is related to waiting until all the elements in the page are fully loaded.
Selenium offers explicit and implicit waits, so basically you can force the system to wait a fixed number of seconds, or wait until an element is loaded.
From Selenium documentation (http://docs.seleniumhq.org/docs/04_webdriver_advanced.jsp)
Explicit wait
An explicit wait is code you define to wait for a certain condition to occur before proceeding further in the code. The worst case of this is Thread.sleep(), which sets the condition to an exact time period to wait. There are some convenience methods provided that help you write code that will wait only as long as required. WebDriverWait in combination with ExpectedCondition is one way this can be accomplished.
require 'rubygems' # not required for ruby 1.9 or if you installed without gem
require 'selenium-webdriver'
driver = Selenium::WebDriver.for :firefox
driver.get "http://somedomain/url_that_delays_loading"
wait = Selenium::WebDriver::Wait.new(:timeout => 10) # seconds
begin
  element = wait.until { driver.find_element(:id => "some-dynamic-element") }
ensure
  driver.quit
end
Implicit wait
An implicit wait is to tell WebDriver to poll the DOM for a certain amount of time when trying to find an element or elements if they are not immediately available. The default setting is 0. Once set, the implicit wait is set for the life of the WebDriver object instance.
require 'rubygems' # not required for ruby 1.9 or if you installed without gem
require 'selenium-webdriver'
driver = Selenium::WebDriver.for :firefox
driver.manage.timeouts.implicit_wait = 10 # seconds
driver.get "http://somedomain/url_that_delays_loading"
element = driver.find_element(:id => "some-dynamic-element")
The answer that aberna gives you on this thread has a lot of great information, but it isn't going to solve your issue. If you use the explicit wait method that aberna suggests, you also probably need to make sure the element is visible. Using .findElements on its own doesn't guarantee clickability/visibility. You could try ExpectedConditions.visibilityOfElementLocated, which checks for visibility as well as presence.
Or, alternatively, you can check for the presence of the element in the DOM using .findElement, but then use ExpectedConditions.visibilityOf to check the visibility part of it.
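Those ExpectedConditions helpers come from the Java/Python bindings; the Ruby selenium-webdriver gem doesn't ship them, but you can approximate the same visibility check with a Wait block. A sketch, assuming the driver from the question:
wait = Selenium::WebDriver::Wait.new(timeout: 10)

chat_button = wait.until do
  element = driver.find_element(:class, 'fbNubButton')
  element if element.displayed?   # only return the element once it is visible
end

chat_button.click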
I use sleep(5) before running the main logic.
I was facing the same issue. The solution that worked for me was to maximise the browser window; this fixed many of my failing specs.
Capybara.current_session.driver.browser.manage.window.maximize

How to disable JavaScript in Capybara

I'm using Capybara to fill in a form and download the results.
It's a bit slow when filling in the form, and I want to check if JavaScript is the culprit.
How do I turn off JavaScript?
The Ruby code was something similar to, but not the same as, the following (the following won't reproduce the error message, but it is somewhat slow).
require "capybara"
url = "http://www.hiv.lanl.gov/content/sequence/HIGHLIGHT/highlighter.html"
fasta_text = [">seq1", "gattaca" * 1000, ">seq2", "aattaca" * 1000].join("\n")
session = Capybara::Session.new(:selenium)
# Code similar to this was run several times
session.visit(url)
session.fill_in('sample', :with => fasta_text)
session.click_on('Submit')
And the error I was getting (with my real code, but not the code I have above) was
Warning: Unresponsive script
A script on this page may be busy, or it may have stopped responding.
You can stop the script now, open the script in the debugger, or let
the script continue.
Script: chrome://browser/content/tabbrowser.xml:2884
I wasn't running Capybara as part of a test or as part of a spec.
To confirm that the code I wrote currently has JavaScript enabled (which is something I want to disable), doing
url = "http://www.isjavascriptenabled.com"
session = Capybara::Session.new(:selenium)
session.visit(url)
indicates that JavaScript is enabled.
Capybara only uses JavaScript if you've specified a javascript_driver:
Capybara.javascript_driver = :poltergeist
And if you've specified js: true as metadata in your spec:
context "this is a test", js: true do
Check for both of those things. If they're not there and the test is not running in a browser or using Poltergeist, then it's probably not using JavaScript.
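For reference, a minimal sketch of how those two settings interact under the capybara/rspec integration (driver names and the spec itself are just examples):
require 'capybara/rspec'

Capybara.default_driver    = :rack_test    # default driver, runs no JavaScript
Capybara.javascript_driver = :poltergeist  # used only for examples tagged js: true

RSpec.describe 'the highlighter form', type: :feature do
  it 'fills in the form without JavaScript' do          # runs with rack_test
    # visit '/form' ...
  end

  it 'fills in the form with JavaScript', js: true do   # runs with the javascript_driver
    # visit '/form' ...
  end
end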
