Clicking link with JavaScript in Mechanize - ruby

I have this:
<a class="top_level_active" href="javascript:Submit('menu_home')">Account Summary</a>
I want to click that link, but I get an error when using link_with.
I've tried:
bot.click(page.link_with(:href => /menu_home/))
bot.click(page.link_with(:class => 'top_level_active'))
bot.click(page.link_with(:href => /Account Summary/))
The error I get is:
NoMethodError: undefined method `[]' for nil:NilClass

That's a javascript link. Mechanize will not be able to click it, since it does not evaluate javascript. Sorry!
Try to find out what happens in your browser when you click that link. Does it create a POST or a GET request? What parameters are sent to the server? Once you know that, you can emulate the same request in your Mechanize script. Chrome dev tools or Firebug will help you see it.
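As a starting point, the argument passed to Submit can be pulled out of the href with a regular expression. This is only a sketch: the helper name js_submit_argument is made up for illustration, and what Submit actually does with 'menu_home' (which URL it requests, which form fields it sets) still has to be read off the browser's network tab.

```ruby
# Hypothetical helper: extracts the first quoted argument from an
# href such as "javascript:Submit('menu_home')".
# Returns nil when the href is not a javascript:Submit(...) link.
def js_submit_argument(href)
  match = href.to_s.match(/\Ajavascript:\s*Submit\(\s*['"]([^'"]+)['"]\s*\)/)
  match && match[1]
end

href = "javascript:Submit('menu_home')"
puts js_submit_argument(href)
```

The extracted value is most likely one of the parameters of the request you need to reproduce, but that is an assumption until you have compared it with what the browser sends.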
If that doesn't work, try switching to a library that supports JavaScript evaluation. I've used watir-webdriver with great success, but you could also try phantomjs, casperjs, pjscrape, or other tools.

The first two should have worked, so print out the hrefs to make sure the link is really there:
puts page.links.map(&:href)
Remember that just because you can see it in your browser does not mean it appears in the response. It could have been sent as an ajax update.
You can also call click on the link itself, which I think is cleaner:
page.link_with(:href => /menu_home/).click
However, I don't think clicking that link will do what you want, since it's JavaScript.
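The NoMethodError in the question is consistent with link_with returning nil, which it does when no link matches, and that nil then being handed to click. A small guard makes the failure explicit. The helper name click_link_matching is an assumption for illustration; page is expected to be a Mechanize::Page, or anything else that responds to link_with.

```ruby
# Hypothetical helper: looks up a link by href pattern and clicks it,
# raising a descriptive error instead of passing nil on to click.
def click_link_matching(page, pattern)
  link = page.link_with(:href => pattern)
  raise ArgumentError, "no link with href matching #{pattern.inspect}" if link.nil?
  link.click
end
```

It would be called as click_link_matching(page, /menu_home/). The guard only tells you whether the link was found; it does not make Mechanize run the JavaScript.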

Here's a way to handle it. Assume your page returns this content:
puts page.body
<HTML><SCRIPT LANGUAGE="JavaScript"><!--
top.location="http://www.example.com/pages/myaccount/dashboard.aspx?";
// --></SCRIPT>
<NOSCRIPT>Javascript required.</NOSCRIPT></HTML>
We know it's coming so we know what to check for:
link_search = %r{top\.location="([^"]+)"}
js_link = page.body.match(link_search)[1]
page = agent.get(js_link)
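Because String#match returns nil when the pattern is absent, indexing the result with [1] raises NoMethodError on pages that do not contain the redirect. A slightly more defensive sketch, using a hypothetical helper name js_redirect_target:

```ruby
# Hypothetical helper: returns the URL assigned to top.location in an
# inline script, or nil when the body contains no such assignment.
def js_redirect_target(body)
  body[/top\.location\s*=\s*"([^"]+)"/, 1]
end

body = <<~HTML
  <HTML><SCRIPT LANGUAGE="JavaScript"><!--
  top.location="http://www.example.com/pages/myaccount/dashboard.aspx?";
  // --></SCRIPT>
  <NOSCRIPT>Javascript required.</NOSCRIPT></HTML>
HTML

puts js_redirect_target(body)
```

With Mechanize this could be used as target = js_redirect_target(page.body), followed by page = agent.get(target) if target.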

Related

ajax returns page source, not the message

All the answers I saw here or elsewhere on Google were with jquery. This is not jquery.
I send an ajax string to a php file.
The php, among other things, formulates a message string which I echo
back to the client.
The returned string is put up in the client as an alert.
The form is then reset.
The problem is that when I do this, the alert shows as much of the page source as it can hold instead of the message. If I open developer tools to look at the response, the alert shows the message correctly, not the page source. Here is the response handling in my ajax:
ajaxRequest.onreadystatechange = function(){
if(ajaxRequest.readyState == 4){
alert(ajaxRequest.responseText);
document.getElementById("thisForm").reset();
}
}
The php file does a simple echo of a text string.
What is it about developer tools that makes this run correctly and why doesn't it print out the message in the alert when developer tools is not there?
When I run the backend php by itself, with or without developer tools, it displays the message properly.
Does anyone have any ideas?
More information: I tried replacing the alert and reset with display.innerHTML=ajaxRequest.responseText, where display is a JavaScript object obtained from getElementById("ajaxReturn") for a <div id="ajaxReturn">. It didn't work. When I tried developer tools, it showed the network response text as being the page source.
I also added && this.status == 200 to the if statement. No change.
The problem is solved. I am not deleting this because it might help some other poster who runs into the same problem. I launched the AJAX with an onclick to a javascript function called ajaxFunction(). The html entity containing the onclick had an href="#" in it. Removing that href solved the problem.
I had the exact same issue and my cause was related to having an extra slash in my URL.
Let's say my URL was:
https://example.com/index.php
I had a wrong link as follows:
https://example.com/index.php/
In both cases my server loads the page, but the Ajax call returns the page source as the response for:
https://example.com/index.php/
But works fine for:
https://example.com/index.php
The Ajax call was essentially posting to index.php/ajaxpage.php, and the server then responded with the contents of index.php instead of the contents of ajaxpage.php.
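The effect of the trailing slash on a relative URL can be checked outside the browser. The sketch below uses Ruby's URI.join, which applies the same kind of relative-reference resolution a browser does; it assumes the Ajax call used the relative URL ajaxpage.php.

```ruby
require 'uri'

# Resolve the same relative reference against both page URLs.
without_slash = URI.join('https://example.com/index.php', 'ajaxpage.php')
with_slash    = URI.join('https://example.com/index.php/', 'ajaxpage.php')

puts without_slash
puts with_slash
```

With the trailing slash, index.php is treated as a directory, so the request ends up under index.php/ rather than next to it.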

Mechanize 'link_with' producing a different URL

I accessed a page that has this link:
<a class="portletpage-portlet-title is-active" tabindex="0" title="Registration" data-ppid="registration_WAR_registration" href="#registration">Registration</a>
The page is served over SSL. The HTML attribute href is #registration. I am trying to follow this link to get to the URL:
www.redacted.com/#registration
Here is my code:
agent.get('*redacted*') do |page|
  # Fill in the login form and submit it
  page.form_with(:action => '*redacted*') do |f|
    f.field_with(:id => 'username').value = get_username()
    f.field_with(:id => 'password').value = get_password()
  end.click_button
end
# agent.page is now the page returned by the login
agent.page.link_with(:text => 'Registration').click
When it clicks on the link, it produces the following error:
`fetch': 404 => Net::HTTPNotFound for https://*redacted*/group/1403104853945/academics?p_p_id=registration_WAR_uofsregistration&p_p_state=maximized -- unhandled response (Mechanize::ResponseCodeError)
from /home/mike/.rvm/gems/ruby-2.4.1/gems/mechanize-2.7.5/lib/mechanize.rb:464:in `get'
from /home/mike/.rvm/gems/ruby-2.4.1/gems/mechanize-2.7.5/lib/mechanize.rb:348:in `click'
from /home/mike/.rvm/gems/ruby-2.4.1/gems/mechanize-2.7.5/lib/mechanize/page/link.rb:30:in `click'
from u-of-s-scraper.rb:34:in `<main>'
and comes up with the URL:
www.redacted.com/group/1403104853945/academics?p_p_id=registration_WAR_uofsregistration&p_p_state=maximized
I'm not sure where Mechanize is getting the URL. The link has an attribute data-ppid, which appears to be contributing to the URL. Can anyone provide some insight?
It turns out that the page is written using Liferay's Portlets. Unfortunately, Portlets are not directly URL accessible, so I am currently investigating a different means of scraping the page - potentially with Selenium or PhantomJS.
data-ppid is a data attribute, which is supposed to be handled by JavaScript. The change of the URL is probably due to some Javascript code on the client side (and a redirect on the server side).
Links that start with # are fragment links (also called "named anchors" or "bookmark links"). They don't request a new page; they just jump to a location within the current one.
In other words, there's no reason to ever "follow" a link like that with Mechanize.
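To illustrate why, the fragment can be separated from the part of the URL that is actually requested. The sketch below uses Ruby's standard URI library on the redacted URL from the question:

```ruby
require 'uri'

uri = URI.parse('https://www.redacted.com/#registration')

puts uri.fragment     # the part after "#", which stays on the client
puts uri.request_uri  # the path (plus query, if any) that is requested
```

The fragment never reaches the server, so any content change tied to #registration has to come from JavaScript running in the page.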

Ruby Watir WebDriver Net::ReadTimeout

I am trying to use Watir to get the source code of Facebook after I authenticate using Watir. It gives this specific error.
/.rvm/rubies/ruby-2.0.0-p247/lib/ruby/2.0.0/net/protocol.rb:158:in `rescue in rbuf_fill': Net::ReadTimeout (Net::ReadTimeout)
I believe that because there are so many AJAX requests on the homepage, WebDriver treats the page as not fully loaded. So after I logged in, I did this:
p "starts"
Watir::Wait.until {
browser.div(:'class' => '_586i').exists?
}
p "finishes"
But after it prints "starts" then it gives a timeout error, and doesn't get the source code of the website.
I've been getting this error quite a lot on some websites after a call such as browser.button.click that redirects to another page heavily loaded with Ajax. I found that clicking through JavaScript and then waiting helps:
browser.execute_script("document.getElementsByTagName('button')[0].click()")
sleep 10
Adjust the sleep as needed or, much better, use .wait_until_present instead.
You can make the script wait until all Ajax calls have completed (this only works if the page uses jQuery) with
sleep(1) until browser.execute_script("return jQuery.active") == 0
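One caveat with a bare sleep ... until loop is that it never gives up if the condition stays false. A bounded version could look like the sketch below; wait_until is a hypothetical helper written for illustration, not part of Watir's API.

```ruby
# Hypothetical helper: polls the block until it returns a truthy value
# or the timeout (in seconds) elapses. Returns true on success and
# false on timeout. This is a generic sketch, not Watir's own wait logic.
def wait_until(timeout: 30, interval: 1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end
```

With a Watir browser it might be called as wait_until(timeout: 30) { browser.execute_script("return jQuery.active") == 0 }, again assuming the page loads jQuery.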

Firefox Ajax post web console summary says 'undefined'

I use Ajax extensively in my JavaScript code. Today I added an Ajax call to a page and nothing happens. The Firefox web console shows a result of "undefined". The exact log entry is:
[11:15:50.733] POST http://mastersw.com/theme/test9.php [undefined 78ms]
(I had to modify the URL to satisfy the editor rules here.)
When I click on the log entry, I see a message dialog with no response. Everything else is correct in the message.
I have checked the Apache logs and there is no sign that the post request got to the server. I use my own JavaScript library Ajax routines. They work everywhere else. I have double checked that the script (test9.php) exists.
I cannot find any documentation on what Firefox means when they say "undefined". Google search returns millions of hits about other things.
The problem seems to be that Firefox is for some reason not completing the post operation and I cannot figure out why.
Update: The JavaScript function invoking the Ajax call was itself being invoked from the onclick handler of an anchor. When I changed the element to a div it worked.
I don't have any idea why Firefox gave an 'undefined' for the post. Chrome complained about an invalid header "Content-Length". Changing to a div fixed this as well.
You need to cancel the anchor's default click action; otherwise the browser may follow the link and abort the request:
$("a").on("click", function (evt) {
evt.preventDefault();
//ajax call here
});

Browser url not returning new url

I am experimenting with using rspec and watir to do some TDD and have come across a problem I can't seem to get past. I want to have Watir click a link (target="_blank") and then get the URL of the newly loaded page. Watir clicks the link, but when I attempt to get the URL I receive the old URL, not the current one. The Watir docs seem to indicate that the Browser url method will return the current URL. I found a blog post that seems to solve this issue by having Watir execute some JavaScript to get the current URL, but this isn't working for me. Is there any way to get the current URL after a link click with Watir?
<!-- the html -->
<a href="http://www.linkedin.com" target="_blank">LinkedIn</a>
#The rspec code
it "should load LinkedIn" do
browser.link(:href => "http://www.linkedin.com").click
browser.url.should == "http://www.linkedin.com"
end
The target will load the link in a new browser window, therefore you need to switch to that window to assert the url:
it "should load LinkedIn" do
browser.link(:href => "http://www.linkedin.com").click
browser.window(:title => /.*LinkedIn.*/).use do
browser.url.should == "http://www.linkedin.com"
end
end
See: http://watirwebdriver.com/browser-popups/ for more examples
