How to Programmatically take Snapshot of Crawled Webpages (in Ruby)? - ruby

What is the best solution to programmatically take a snapshot of a webpage?
The situation is this: I would like to crawl a bunch of webpages and take thumbnail snapshots of them periodically, say once every few months, without having to manually go to each one. I would also like to be able to take jpg/png snapshots of websites that might be completely Flash/Flex, so I'd have to wait until it loaded to take the snapshot somehow.
It would be nice if there was no limit to the number of thumbnails I could generate (within reason, say 1000 per day).
Any ideas how to do this in Ruby? Seems pretty tough.
Browsers to do this in: Safari or Firefox, preferably Safari.
Thanks so much.

This really depends on your operating system. What you need is a way to hook into a web browser and save the rendered page as an image.
If you are on a Mac - I would imagine your best bet would be to use MacRuby (or RubyCocoa - although I believe this is going to be deprecated in the near future) and then to use the WebKit framework to load the page and render it as an image.
This is definitely possible; for inspiration you may wish to look at the Paparazzi! and webkit2png projects.
Another option, which isn't dependent on the OS, might be to use the BrowserShots API.

There is no built-in library in Ruby for rendering a web page.
Using Selenium & Ruby is one possibility. You can run Firefox as a headless browser (i.e. on a server).
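As a rough illustration of the Selenium route, here is a minimal sketch, assuming the selenium-webdriver gem and geckodriver are installed. The helper names (snapshot_filename, snapshot_all, build_headless_firefox), the file naming scheme and the fixed settle_seconds wait are my own assumptions, not part of Selenium. The fixed wait is a crude stand-in for "wait until the plugin content has loaded"; only navigate.to and save_screenshot are actual driver calls.

```ruby
require 'fileutils'
require 'uri'

# Turn a URL into a filesystem-safe PNG file name (hypothetical naming scheme).
def snapshot_filename(url)
  uri = URI.parse(url)
  slug = "#{uri.host}#{uri.path}".gsub(/[^A-Za-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
  "#{slug}.png"
end

# Visit each URL with the given driver and save a PNG screenshot of it.
# `driver` is expected to behave like a Selenium::WebDriver driver: it must
# respond to `navigate.to(url)` and `save_screenshot(path)`.
# `settle_seconds` is a fixed pause so slow content has a chance to load.
def snapshot_all(driver, urls, out_dir, settle_seconds: 5)
  FileUtils.mkdir_p(out_dir)
  urls.map do |url|
    path = File.join(out_dir, snapshot_filename(url))
    driver.navigate.to(url)
    sleep(settle_seconds)
    driver.save_screenshot(path)
    path
  end
end

# Build a headless Firefox driver. The gem is required lazily so the helpers
# above can be loaded (and tested with a stand-in driver) without it.
def build_headless_firefox
  require 'selenium-webdriver'
  options = Selenium::WebDriver::Firefox::Options.new
  options.add_argument('-headless')
  Selenium::WebDriver.for(:firefox, options: options)
end
```

To use it, build a driver with build_headless_firefox, pass it to snapshot_all along with your list of URLs and an output directory, and call driver.quit when done. Running that from cron every few months would cover the periodic part; resizing the PNGs into thumbnails is left to an image library of your choice.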
Here is the source code for BrowserShots: http://sourceforge.net/projects/browsershots/files/
If you are using Linux you could use http://khtml2png.sourceforge.net/ and script it via Ruby.
Some paid services that automate this:
http://webthumb.bluga.net/home
http://www.thumbalizr.com

As viewed by... IE? Firefox? Opera? One of the myriad WebKit engines?
If only it were possible to automate http://browsershots.org :)

Use selenium-rc; it comes with snapshot capabilities.

With JRuby you can use SWT's browser library.

Related

Profiling Webworkers in Firefox

I am trying to profile a JavaScript app with Firefox Quantum. The part that I am most interested in runs inside web workers. I am not sure if I am doing everything correctly, but I cannot find a way to get any useful data with the built-in profiler. All I can see is this:
Is there a hidden feature that can help me look inside the webworker?
For anyone else finding this in the future: https://profiler.firefox.com/ is able to profile Web Workers.
You need to open the Resources drop-down and choose Thread: DOM Worker.
After looking into this further, I came to the conclusion that Firefox does not support runtime analysis for web workers at the moment. I will have to stick with Chrome for profiling our app.

Desktop application using Firefox WebExtensions

I am working on a XUL desktop application, where I use the browser tag and load a URL in that tag within the desktop application.
However, some websites render in an outdated format, and according to Mozilla, XUL is deprecated and will not be usable at the end of 2017. I want to build the application with the latest technology: WebExtensions.
I have searched many examples on the usage of WebExtensions, but all are working within the browser. Can I make a standalone desktop application just like XUL, but using WebExtensions?
If yes, then please give me some hints on how to get started.
If no, is there an alternative that meets the same requirement?
WebExtensions are fairly limited in their scope. Even if there was an application runtime utilising them, you probably wouldn't get much use out of them due to the restrictive isolation from the host system.
Strictly speaking not WebExtensions, albeit very similar:
The Electron framework/runtime*
Someone at Mozilla is also working on an alternative dubbed "Positron"**, though that software's future is uncertain and there is a chance he might abandon it for an entirely new, highly simplified project (at least that's what I gathered from my conversation with him on GitHub).
*http://electron.atom.io/
**https://github.com/mozilla/positron

Is there any way to automate the testing of flash within web pages using Watir-Webdriver?

I am attempting to test several web pages built in Flex, and need to automate clicking on several videos through the Flash interface. I'm using Ruby and Watir-Webdriver, but I'm not sure how to interact with Flash using them.
Has anyone figured this out? I've tried using Sikuli, but have found it to be a little clunky and not very fast. Any ideas would be greatly appreciated.
I will quote myself:
It is important to say that Watir CAN NOT control browser plugins like Java applets, Adobe Flash or Microsoft Silverlight.
From https://github.com/zeljkofilipin/watirbook/blob/master/about.md
There is a way, though. You can embed JavaScript into your Ruby Watir script.
It has worked for me:
browser.execute_script <<-JS
Global.videoPlayer.sendEvent("play")
JS
Similarly, you can do a pause or stop, depending on the controller on the player.
Enjoy !!
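If you need this in several places, the call can be wrapped in a small helper. This is only a sketch: Global.videoPlayer.sendEvent comes from the page under test, not from Watir, and the "pause" and "stop" event names are assumptions based on the remark above. The only Watir call used is browser.execute_script; the helper names are made up for illustration.

```ruby
# Events the player on the page is assumed to accept. Only "play" is taken
# from the snippet above; "pause" and "stop" are assumed names.
PLAYER_EVENTS = %w[play pause stop].freeze

# Build the JavaScript that asks the page's player object to handle an event.
# Restricting events to a fixed list keeps arbitrary text out of the script.
def player_event_script(event)
  unless PLAYER_EVENTS.include?(event)
    raise ArgumentError, "unsupported player event: #{event.inspect}"
  end
  %(Global.videoPlayer.sendEvent("#{event}");)
end

# Run the generated script in the browser. `browser` is expected to be a
# Watir::Browser, or anything else that responds to execute_script.
def send_player_event(browser, event)
  browser.execute_script(player_event_script(event))
end
```

With a real browser this would be called as send_player_event(browser, 'play') once the page hosting the player has loaded.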

Is it safe to use code from code.jquery.com for long-term application?

I am using Ajax / jQuery on a webpage I am designing. In order for it to function, I include (at the top of my page) the JavaScript at: http://code.jquery.com/jquery-1.4.4.js
This works great and all, but I fear that:
1) the code might get changed without my knowing, and I then encounter problems and spend hours or days debugging before finding that the code at this site changed
2) the website is no longer used, or the specific code is no longer hosted, years from now
So would it be safer to save that JavaScript file onto my server, and access it from there?
You should use either a Microsoft or Google CDN. It will be much faster, it will be cached for a lot of your users and it's guaranteed to be there, as opposed to the jQuery link you include.
http://code.jquery.com is jQuery's CDN (provided by Media Temple). The code at http://code.jquery.com/jquery-1.4.4.js will never change; jQuery will release a new version (which will be at a different URL), if anything needs to change (which happens all the time; version 1.5b was released today).
The jQuery guys know what they're doing, and they set up a CDN so people can easily link to jQuery. They're just as (un)likely to bring down the CDN as Google and Microsoft are to bring theirs down.
See http://docs.jquery.com/Downloading_jQuery for more information.
Having said that, it would seem the Google-hosted version (http://ajax.googleapis.com/ajax/libs/jquery/1.4.4/jquery.min.js) is referenced by more websites; this leads to a small performance advantage as far as your users are concerned, as the file has more chance of being cached.
It's safe. Notice the version number? When jQuery is updated, that version number will change.
Of course, using a CDN will always mean that it's possible for the content delivery network to go out of business. But that's the case with any server you don't directly control.
You could, of course, use the Google CDN for jQuery; I highly recommend it.
Relevant:
http://code.google.com/apis/libraries/devguide.html#jquery
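If you do keep a copy on your own server and still worry about the hosted file changing silently, one option is to compare digests now and then. The following is a minimal sketch using only the Ruby standard library; the helper names and the idea of a scheduled check are my own assumptions, and fetch does no redirect or error handling.

```ruby
require 'digest'
require 'net/http'
require 'uri'

# Hex SHA-256 digest of a string of file contents.
def sha256_hex(content)
  Digest::SHA256.hexdigest(content)
end

# True when two copies of a file are byte-for-byte identical.
def same_content?(local_content, remote_content)
  sha256_hex(local_content) == sha256_hex(remote_content)
end

# Download the body of a URL (a sketch: no redirects, no error handling).
def fetch(url)
  Net::HTTP.get(URI.parse(url))
end

# Compare the copy saved on your server with the hosted file.
# `local_path` is a hypothetical path to the saved copy.
def hosted_copy_unchanged?(local_path, url)
  same_content?(File.binread(local_path), fetch(url))
end
```

For example, hosted_copy_unchanged?('public/js/jquery-1.4.4.js', 'http://code.jquery.com/jquery-1.4.4.js') could be run from a scheduled job, with the local path adjusted to wherever you saved the file.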

How can I check my AJAX for browser compatibility?

I always have to check each and every browser to see if my website would work. Is there a website I can check it with?
Update:
I don't really want just screenshots (which is what BrowserShots does); I want to actually test the posting of my script.
You want a web site to check your web site for JavaScript compatibility? How would you expect it to know how to exercise your interface to trigger the proper interactions? Or are you thinking of it doing some sort of static code analysis? I think you are better off coding against a framework that has solved most of the browser-dependent idiosyncrasies and using it to check for browser capabilities before you use them. jQuery, MooTools, Prototype/Scriptaculous, etc. go a long way in solving these problems for JavaScript.
Note that you still need to worry about rendering your site, but you already have several answers for how to go about doing that based on web sites. Personally, I just maintain IE/Safari/FF/Opera/Chrome on my workstation and do significant checking in IE/FF and basic checking in Safari/Opera/Chrome.
Even when there exist websites that allow you to see a static snapshot of your site in several browsers, you should really test your page on them yourself, because there can be subtle, and not so subtle, bugs and differences that are only apparent when interacting with the webpage.
You can cover yourself quite a lot by testing in:
A Gecko engine browser (Firefox)
A Webkit engine browser (Chrome, Safari, Konqueror)
Opera
AND IE6+
John Resig recommends checking the Yahoo graded browser support documentation.
If you write unit tests for your JavaScript, you could use TestSwarm: http://testswarm.com
There are multiple options:
http://ipinfo.info/netrenderer/
This site will let you run multiple browsers and versions without installing them. You only need to install a plugin:
http://spoon.net/browsers/
There are plenty of sites, just Google/Bing for browser compatibility check.
http://browsershots.org/ is a good one.
Most of them just take a snapshot of the site, though, so you might have to do a manual check for things like menus and dynamic content.
BrowserShots might do what you want if you can tell by rendering a particular URL whether or not things will work as expected.
In light of your update, you could still use BrowserShots by creating a page which tests each of your scripts and renders 'pass' or 'fail' as its content depending on whether they work or not.
Failing that, Multiple IE is quite useful for running various versions of IE on one PC which can otherwise be problematic.
