Pass information back from an iframe? - firefox

Right now i'm building a firefox plugin that duplicates some functionality on my website. It takes in an email address and then returns information to the user. The easiest way to do this in the plugin is to use an Iframe and render that super simple form on my website. All of this works great, but to make the plugin really useful, i would like the plugin to have access to the information that the iframe renders, so it can use it in the current window that the user is in.
Is it possible to pass information back through an Iframe in this manner? I know there are quite a few domain access restrictions with Iframes, so any help or insight is appreciated!!

I've done this two ways.
If the iframe is on the same domain as the parent website, you can just, in javascript, access window.parent.
If it isn't, however...I've done a dirty trick. I'll share it here, though, as it may help.
We created a page on the other domain, which would call to window.parent.parent. We put that in a hidden iframe inside the iframed page, and send it a querystring argument or two. It's not pretty, but it gets around cross-domain scripting problems.
This basically means that you have this sort of thing:
admin.example.com
content.example.com - iframe
admin.example.com?contentid=350 - hidden iframe that makes a window.parent.parent call.

Is the point of this whole exercise functional testing of your website? If so, instead of your custom Firefox plugin, consider using Selenium to automate interactions with websites. It works with all major browsers and supports the inspection of page elements you are trying to do (using XPath). It also features a Firefox plugin called Selenium IDE that allows you to conveniently "record" your interactions with a website for automated playback later.

Related

Javascript user request layer

I am currently writing a compatibility layer between browsers and for this I need to ask the user to confirm an action. Currently the only standard way in JavaScript to do this is window.confirm which is synchronous and I do not want to block the whole site. So I would be searching for a library which can display a asynchronous browser-like request (e.g. the ones they use for Geolocation).
EDIT: And similar to the native one I do not need/want the user interaction to be modal. Just displaying and reacting on user input that is all.
I remember having seen such sites, but cannot remember where.
Can someone point me in the right direction?
As a bonus it would be great if it would work an look like the native ones in IE, FF and Opera.
the jQueryUI library has a dialog plugin that can be made modal. Since it is JS, it does not block the rest of the page execution.

when to use AJAX and when not to use AJAX in web application

We have web applications elgifto.com, roadbrake.com in which we used AJAX at many places, especially to update major portions of a page. All the important functionality of elgifto.com was implemented using AJAX. Now we realize a few issues due to AJAX implementation.
All the content implemented using
AJAX is not available to the SEO
bots and it is hurting the page rank
of our site.
Users will not be able to bookmark
some of the pages as they are always
available through AJAX.
When we want to direct the user from
one page through an anchor link to
another page having AJAX, we find it
difficult.
So now we are thinking of removing AJAX for these pages and use it only for small functionality such as something similar to marking a question as favorite in SO. So before going ahead and removing, we want to know expert's opinion on this. Thanks.
The problem is not "AJAX" per se, but your implementation of it. Just as a for instance, you can fix the 'bookmark' problem like google maps does it: provide a generated link for each state of your webapp.
SEO can befixed by supplying various of these state-links to the crawlers, either organically trough links in your site, or by supplying a list (sitemap).
If you implement 2, you can fix 1 and 3 with those links.
In the end you must figure out if the effort is worth it, and if you are not overusing AJAX ofcourse, but the statements you've made are not set in stone at all.
I'm costantly developing ajax based websites, with no problems for SEO at all. You just have to use it in the best possible way.
For example, I have a website with normal links pointing to normal webpages (PHP pages), this for normal navigation if a user doesn't have JS enabled. But if a user has JS enabled, a script will change the links behavior, only fetching the content of the page needed.
This way you still have phisycal separated webpages with all their content, which will be indexed as normal.

Automate website log-in and form filling?

I'm trying to log in to a website and save an HTML page automatically (I want to be able to do this on a regular time interval). From the surface, this is a typical modern website where, if the user navigates directly to a "locked" URL, a log-in form pops up, and after logging in, the user is redirected to the intended page.
I gave mechanize a shot (http://wwwsearch.sourceforge.net/mechanize/) but it wasn't finding some form elements which were needed for login (hidden elements that have some values put in by a javascript function that runs when the user clicks the "log in" button).
I played a bit with the "web browser" control in .NET but quickly lost interest because I couldn't even get it to submit a query on the Google page.
I don't care what the language is; I'll learn it to solve this problem. At a minimum it has to work in Windows.
A simple example, say, typing in a query into the Google search box would be a great bonus.
In my experience, the most reliable way is to use javascript. It works well in .Net. To test, browse to the following addresses one after another in Firefox or Internet Explorer:
http://www.google.com
javascript:function f(){document.forms[0]['q'].value='stackoverflow';}f();
javascript:document.forms[0].submit()
That performs a search for "stackoverflow" on Google. To do it in VB .Net using the webbrowser control, do this:
WebBrowser1.Navigate("http://www.google.com")
Do While WebBrowser1.IsBusy OrElse WebBrowser1.ReadyState <> WebBrowserReadyState.Complete
Threading.Thread.Sleep(1000)
Application.DoEvents()
Loop
WebBrowser1.Navigate("javascript:function%20f(){document.forms[0]['q'].value='stackoverflow';}f();")
Threading.Thread.Sleep(2000) 'wait for javascript to run
WebBrowser1.Navigate("javascript:document.forms[0].submit()")
Threading.Thread.Sleep(2000) 'wait for javascript to run
Notice how the space in the URL is converted to %20. I'm not certain if this is necessary but it can't hurt. It is important that the first javascript be in a function. The calls to Sleep() are to wait for Google to load and also for the javascript stuff. The Do While Loop might run forever if the page fails to load so for automation purposes have a counter that will timeout after, say, 60 seconds.
Of course, for Google you can just navigate directly to www.google.com?q=stackoverflow but if your site has hidden input fields, etc, then this is the way to go. Only works for HTML sites - flash is a whole other matter.
If I understand you right, you want to log in to only one webpage, and that form always stays the same. You could either reverse engineer the java script, or debug it via a javascript debugger in the browser (e.g. firebug for firefox). Or you can fill in the form in your browser and look at the http request via a network packet sniffer. Once you have all required form data to submit, you can do the same with your program (thats what I did the last time I had a pretty similar task to do). dont forget to store all cookie data you requested back from the webserver and send it with the next request, to 'stay logged in'.
Its being already discussed here.
Basically its gist is you can use selenium, an open source web automation tool, which has api library available in various languages like java, ruby, etc.
Neoload can handle the form filling with authentication, assuming you don't want to collect data, just perform actions. It's a web stress tool, so it's not really meant to be used as a time-based service, but you COULD just leave it running.
I've used Ruby and Watir (a web app testing suite) for something similar, but it was a very small task (basically visiting URLs from a text file and downloading an image).
There's also an extension called iMacros that can do some automation, but I'm not personally familiar with it (just aware of it).
"I'm trying to log in to a website and save an HTML page automatically"
SAVEAS TYPE=HTM FOLDER=C: FILE=page.html
https://addons.mozilla.org/en-US/firefox/addon/imacros-for-firefox/?src=search
This commands played in iMacros addon will save the page on C: drive and name it page.html
Also,
URL GOTO=www.website.com
Goes on the particular website you want to save. You can also use scripting in iMacros and set different websites in macro.

Javascript: achieving the Google Ad AJAX effect

I need to create a portable script to give to others to implement on their websites that will dynamically show content from my database (MySQL).
I know AJAX has a cross-site problem, but it seems that Google's ad's somehow manage the effect in a cross-browser / cross-site fashion.
Knowing that I have to give people a simple cut/paste snippet to put in their website...how can I achieve this? How did Google?
They use an <iframe>, so the ad is served from their server, and can talk to their database. I'm not actually sure that they use any sort of AJAX from their ads, though; they appear to just be mostly static content, with a few scripts for tweaking the formatting (which are optional, since they want their ads to be visible even if users have JS turned off).
Remember, you can always look into this on your own, and see what they did. On Firefox, use Firebug to explore the html, css, and scripts on a site. On WebKit based browsers (Safari, Chrome, and others), you can use the Web Inspector.
Google's ad code is loaded via a script tag that calls a remote javascript file. The AJAX restrictions that are generally enforced with xmlhttp, iframe, and similar AJAX requests don't apply when it comes to loading remote javascript files.
Once you've loaded the javascript file, you can create iframes in your page that link back to the actual hosted content on your server (and feed them any data about the current page that you wish).
jQuery has built in support for jsonp in their ajax calls. You may want to lookin in to using that if you are really needing to use ajax.
http://api.jquery.com/
http://docs.jquery.com/Ajax
You don't need iFrames and you don't need AJAX. It's really, really simple!
You pull in a remote JS file that is actually a constructed file from php/asp/whatever. In your JS file you have a document.write script that writes the content. It's that simple.
We do this all the time with media stored on separate sites. Here's an example.
YOUR SERVER: file.php (which outputs js)
<script>
document.write("I'm on a remote server");
</script>
OTHER SITE:
<script src='http://www.yourserver.com/file.php'></script>
And it will output the content generated by the script. To make the content customized you can put in script vars above the script call that will adjust what your file pulls out. From there it's pretty straightforward.
I realize this question is a year old, but I've written a library that can help with the document.write part of the problem (whether this is a TOS violation, I don't know) writeCapture.js. It's pretty simple:
$('#ads').writeCapture().html('<script src="whatever-your-adsense-code-is"> </script>');
The example uses jQuery, but you can use it standalone as well.

Screen scraping an ASP.NET web page to retrieve data displayed in the grid view

I am using RUBY to screen scrap a web page (created in asp.net) which uses gridview to display data. I am successfully able to read the data displayed on page-1 of the grid but unable to figure out how I can move to the next page in the grid to read all the data.
Problem is the page number hyperlinks are not normal hyperlinks (with URL) but instead are javascript hyperlink which causes postback to the same page..
An example of the hyperlink:-
6
I recommend using Watir, a ruby library designed for browser testing, if you're already using ruby for processing. For one thing, it gives you a much nicer interface to the DOM elements on the page, and it makes clicking links like this easier:
ie.link(:text, '6').click
Then, of course you have easier methods for navigating the table as well. It's easy enough to automate this process:
1..total_number_of_pages.each do |next_page|
ie.link(:text, next_page).click
# table processing goes here
end
I don't know your use case, but this approach has its advantages and disadvantages. For one thing, it actually runs a browser instance, so if this is something you need to frequently run quietly in the background in completely automated way, this may not be the best approach. On the other hand, if it's ok to launch a browser instance, then you don't have to worry about all that postback nonsense, and you can just click the link as if you were a user.
Watir: http://wtr.rubyforge.org/
You'll need to figure out the actual URL.
Option 1a: Open the page in a browser with good developer support (e.g. firefox with the web development tools) and look through the source to find where _doPostBack is defined. Figure out what URL it's constructing. Note that it might not be in the main page source, but instead in something that the page loads.
Option 1b: Ditto, but have ruby do it. If you're fetching the page with Net:HTTP you've got the tools to find the definition of __doPostBack already (the body as a string, ruby's grep, and the ability to request additional files, such as those in script tags).
Option 2: Monitor the traffic between a browser and the page (e.g. with a logging proxy) to find out what the URL is.
Option 3: Ask the owner of the web page.
Option 4: Guess. This may not be as bad as it sounds (e.g. if the original URL ends with "...?page=1" or something) but in general this is the least likely to work.
Edit (in response to your comment on the other question):
Assuming you're using the Net:HTTP library, you can do a postback by just replacing your get with a post, e.g. my_http.post(my_url) instead of my_http.get(my_url)
Edit (in response to danieltalsky's answer):
watir may be a really good solution for you (I'm kicking myself for not having thought of it), but be aware that you may have to manually fire the event or go through other hoops to get what you want. As a specific gotcha, with any asynchronous fetch like this you need to make sure that the full response has come back before you scrape it; that isn't a problem when you're doing the request inline yourself.
You will have to perform the postback. The data is pass with a form POST back to the server. Like Markus said use something like FireBug or the Developer Tools in IE 8 and fiddler to watch the traffic. But honestly this is a web form using the bloated GridView and you will be in for a fun adventure. ;)
You'll need to do some investigation in order to figure out what HTTP request the javascript execution is performing. I've used the Mozilla browser with the Firebug plugin and also the "Live HTTP Headers" plugin to help determine what is going on. It will likely become clear to you which requests you will need to make in order to traverse to the next page. Make sure you pay attention to any cookies getting set.
I've had really good success using Mechanize for scraping. It wraps all of the HTTP communication, html parsing and searching(using Nokogiri), redirection, and holding onto cookies. But it doesn't know how to execute Javascript, which is why you will need to figure out what http request to perform on your own.

Resources