Download Content From External URL and save in db with Ruby

This isn't about web services. I want to pass a URL to a controller, have it fetch the HTML from that page, and then store the information in a database.
What do you think? How can I accomplish this?

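In your controller:
html = IO.popen(["curl", "-s", "--", params[:url]], &:read)
That runs the system curl command and saves its output (that is, the content fetched from the URL) in the variable html. Passing the command as an array keeps params[:url] away from the shell, so a crafted URL cannot inject extra commands, which it could if it were interpolated into %x[curl ...]. Then you can make hot cakes with that string if you want to.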

Yes, this is doable.
Hint: OpenURI from the standard library can fetch the page: http://en.wikibooks.org/wiki/Ruby_Programming/Standard_Library/OpenURI
Then store the result through an ORM, or use the MySQL driver directly.
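
As a minimal sketch of the OpenURI route, the snippet below validates the incoming URL and downloads the page body. The helper names parse_http_url and fetch_html are made up for illustration, and the Page model in the closing comment is a hypothetical ActiveRecord model, not something from the question.

```ruby
require "open-uri"
require "uri"

# Hypothetical helper: parse a user-supplied URL and accept only http(s),
# so inputs such as file:///etc/passwd are rejected before any request.
def parse_http_url(raw)
  uri = URI.parse(raw.to_s.strip)
  unless uri.is_a?(URI::HTTP) && !uri.host.to_s.empty?
    raise ArgumentError, "only http(s) URLs are allowed"
  end
  uri
rescue URI::InvalidURIError
  raise ArgumentError, "not a valid URL"
end

# Hypothetical helper: download the page body with OpenURI.
def fetch_html(raw_url)
  parse_http_url(raw_url).open(&:read)
end

# In a Rails controller this might be used as follows, assuming a
# hypothetical ActiveRecord model Page with url and html columns:
#
#   Page.create!(url: params[:url], html: fetch_html(params[:url]))
```

Checking the scheme first matters because the URL comes straight from the request; without it, a caller could point the controller at local files or other non-HTTP resources.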

When I read your post, the first thing I thought of was Watir (http://watir.com/).
Watir is a family of Ruby libraries but it supports your app no matter what technology it is developed in. They support Internet Explorer on Windows, Firefox on Windows, Mac and Linux, Safari on Mac, Chrome on Windows and Flash testing with Firefox.
Like other programming languages, Ruby gives you the power to connect to databases, read data files and spreadsheets, export XML, and structure your code as reusable libraries. Unlike other programming languages, Ruby is concise and often a joy to read.
You can easily grab the HTML and then load it into a database, Excel, etc.

Related

Using SQLite in Mac App

I am looking for a good tutorial or starting point for using SQLite in a Mac OS X app. I have experience with iPhone development but have never dealt with SQLite before; all my apps were enterprise level, where I talk to a RESTful server to post and get data, and all the SQL work happens on the server side.
All my search attempts returned iPhone results and some UI wrappers for OS X. I guess fewer people code for OS X than for iPhone :)
I am simply trying to make my app do the following:
When it runs for the first time, check for the DB and create it if it does not exist. I would prefer that the code invoke a SQL script that creates the DB if it does not exist or, if it does exist, checks that all tables, FK relations, etc. are correct. (I know how to write that script; I just need to know how to invoke it from a Cocoa OS X app.)
Basic SQL operations: INSERT/UPDATE/DELETE.
But before all this, is SQLite3 the correct approach for Mac OS X apps, or should I stay with plist files? Can a "normal" user mess up the state of the SQLite3 database? Are there any permission issues that I have to worry about? I want my users to just launch the app while I do everything in the background for them. (I know I will support 10.8+ for this.)
Depending on your data needs, you might consider using Core Data. It's not right for every situation, but it might be a good thing to check out. It can store data in XML or SQLite formats on the backend, so you can pick the right format depending on the data characteristics of your app.
If you know you want SQLite directly, FMDB is a good wrapper around it. I used FMDB a few years ago in a Mac app for a client and it worked pretty well.
Even if FMDB isn't your style, reading the source may give you a good example of how the SQLite API works.
If you are an iOS developer then you are aware of Core Data, which is probably a better choice than raw SQLite for Mac Applications.

Editor for end user documentation in C# WinForm app

I'm developing a WinForm app in C# 4.0 and would like other (non-developer) colleagues to contribute to writing a context-sensitive end-user help file. First I thought I could use "HTML Help Workshop" from Microsoft, but it seems outdated (Vista and Windows 7 not supported).
Then I looked at Sandcastle, but the documentation is lacking and I wonder if it is suitable for non-technical users to write end-user documentation.
So I read about RoboHelp, but it's way too expensive for me.
I'm getting lost in all the information that is available about help files. Can someone share some best practices or information on what tools to use and what output format I should target (still chm, or something else)?
Great question. I like your idea of non-developers contributing to the end-user documentation.
This idea might motivate users and testers of your application to easily contribute to the documentation.
The first thing that comes to my mind is using some sort of wiki engine. You could build a simple function in your WinForm application that fires up a browser and directs it to the wiki. You could use the context from which it is called to build up a URL; e.g. http://dev-wiki.mycompany.com/LoginForm?action=edit. Here the name of the form ("LoginForm") is used in the URL of a wiki page.
Alternatively, you could simply use the embedded web browser control for WinForms to access the wiki. That would look something like:
var url = GetWikiUrl(myForm);
browserControl.Navigate(url);
This would be very easy to embed in your application.
In a controlled (office) environment, this would be very easy to set up. In your production environment it might be a bit more difficult, but still doable. It might encourage some end-user contributions too.
For writing documentation, I use sphinx.
It lets you document in plain text and has various output formats (chm, html, pdf etc.).
Some of these (chm, html) can be used as context-sensitive help sources.
However simple, the sphinx user-interface (text editor and make file) might not be suitable for non-technical users.
I would recommend using Help+Manual for creating CHM documentation. It's similar to MS Word, and any PC user can start contributing to the documentation after a short introduction.
But this tool isn't free :(

Best method for a webpage to access a mac's peripherals?

I'm building a web-based application that can use ActiveX controls to print to a thermal label printer (specific to shipping labels) in Windows environments, but I am racking my brain to figure out what the best method would be for OS X. Obviously ActiveX doesn't work on Macs...
Any ideas about where to start looking? A flash movie? A firefox add-on? My fingers are tired of googling.
There's no way a vanilla web language will allow you to control peripherals from a webpage under Mac OS.
If you really really need to call that from a webpage and can't afford to make a real application, your best go under Safari would be to build a plugin to use Objective-C from Javascript, and do the heavy-duty work from within your plugin. A similar solution probably exists in Firefox.
Also, as I understand it, your program runs on the client with the printer attached. You could write a server-side script and install it on the Macs, and then have your webpage drive it to do the printing.
My first choice to solve this problem quickly would be to use an enterprise label print server like Loftware or Bartender. But, like you said, they are expensive and you are planning on reselling your product.
My second choice would be to scrap the ActiveX control and build a simple print server. There is no standard control language in the label printer world, but if you are going to standardize on a certain class of Zebra printer you would only need to implement one driver at first. I have only ever done this for Datamax printers, but I'm sure the process for Zebra printers is similar.
The server takes your label data as input (pallet ID, ship to address, etc), inserts that data into a template (painstakingly crafted in the text based printer control language) and then this label file is sent to the appropriate printer.
My third choice would be the browser based solution you are looking for. IT departments hate that stuff.
You can create an NPAPI plug-in, which will work in Safari, Firefox and other Mac web browsers. You'll need to have the user install the plug-in on their system before it can be used; there's no way to install it automatically.
Can't you just use the JavaScript printing API?

Using the browser for desktop UI

How can I use the browser as a UI for a desktop app? The ways I have come up with so far are...
Use all HTML/Javascript. Problem: Can't access filesystem or just about anything else.
Run a local webserver while the application is in use. Problem: How do I kill it when the user is done? My users are not technical enough to Ctrl+C.
Embed a browser component in a regular GUI. Problem: Embedded browser components tend to be glitchy at best. The support for Javascript/CSS is never as good as it is in a real browser.
...?
The ideal solution would work with any technology. I know there are options like writing Firefox extensions, but I want to have complete freedom in the backend technology and browser independence.
Please note that if you choose to run a local webserver, you're creating a security risk.
Any webpage running on the same machine that knows about your app can send requests to your server using Javascript, and you have no simple and reliable way of knowing where the request came from. (Don't trust the referer header.)
Google Desktop, which uses a similar approach, has had several real-world vulnerabilities that allow any webpage to read any file on disk.
There are several ways to protect against this; I would recommend requiring each request to carry an auth key that is randomly generated per machine (and expires at some point), which you could put in the source of the actual pages. XHR protection (the browser's same-origin policy) would prevent malicious websites from reading the auth key, rendering them powerless.
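
A minimal sketch of such an expiring auth key, written in Ruby to match the main question: the class name AuthKey, the one-hour default lifetime, and the way the key is checked are all assumptions for illustration. The server would embed key.value in the pages it serves and call valid? on every incoming request.

```ruby
require "securerandom"
require "digest"

# Hypothetical per-machine auth key with an expiry time.
class AuthKey
  attr_reader :value

  def initialize(ttl_seconds = 3600, now = Time.now)
    @value = SecureRandom.hex(32) # 64 random hex characters
    @expires_at = now + ttl_seconds
  end

  # True only if the key has not expired and the candidate matches.
  def valid?(candidate, now = Time.now)
    return false if now >= @expires_at
    secure_equal?(candidate.to_s, @value)
  end

  private

  # Compare fixed-length digests byte by byte without stopping early,
  # so response timing does not reveal how much of the key matched.
  def secure_equal?(a, b)
    da = Digest::SHA256.digest(a).bytes
    db = Digest::SHA256.digest(b).bytes
    da.zip(db).reduce(0) { |acc, (x, y)| acc | (x ^ y) }.zero?
  end
end
```

Requests without a valid key should be rejected before the server touches the file system or any other local resource.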
If you are looking for a Python web server with a kill link, you could always check out CherryPy.
import threading
import webbrowser

import cherrypy


class MyApp:
    """Sample request handler class."""

    @cherrypy.expose
    def index(self):
        return """<html><head><title>An example application</title></head>
<body>
<h1>This is my sample application</h1>
Put the content here...
<hr>
<a href="exit">Quit</a>
</body></html>"""

    @cherrypy.expose
    def exit(self):
        # Reached through the Quit link; SystemExit stops the server.
        raise SystemExit(0)


class MyBGThread(threading.Thread):
    """Runs the CherryPy server in a background thread."""

    def __init__(self):
        threading.Thread.__init__(self)
        self.start()

    def run(self):
        cherrypy.tree.mount(MyApp())
        cherrypy.quickstart()


myThread = MyBGThread()
# Open the default browser on CherryPy's default address and port.
webbrowser.open("http://127.0.0.1:8080")
This code is based on the sample from the SingleClickAndRun on the cherrypy website:
http://tools.cherrypy.org/wiki/SingleClickAndRun
Note that in a normal web app you would probably use a templating engine and load templates from methods like main.
Something that would be nice would be to embed a browser control in a GUI window and close the server when the app exits.
For security, you could add an authentication scheme. CherryPy supports a few, but you could also implement your own using tool modules.
I am looking to do the exact same thing (a desktop app that uses an up-to-date HTML5 / CSS3 browser as its GUI), only with Ruby (for various reasons I decided to work with Ruby). It's amazing how many cross-platform libraries people have come up with. And yet almost no one has done any work on getting a web browser to be a desktop app UI. As for the cross-platform issue... well, I won't say it is solved, but I will say several steps in the right direction have been taken.
To me this would be perfect with the new HTML5 / CSS3 standards coming out. I know it can be done with a web server running locally.
Another way might be what the guys from “280 North” are doing. They developed Objective-J (an extension of regular JavaScript that mimics how Objective-C extends regular C) and Cappuccino (the Objective-J equivalent of Objective-C’s Cocoa framework on the Mac). They also developed “Atlas”, which is 280 North’s version of Apple’s “Interface Builder” from Xcode, for building Internet applications with their Objective-J and Cappuccino frameworks. Atlas is actually a Cappuccino web app running on your desktop as a desktop app. In this case they use Narwhal… a cross-platform, general-purpose JavaScript platform for developing JS apps outside of the browser (basically a specialized web server).
If anyone can come up with an idea to make “Browser, direct connect to Desktop App” work without a co-existing web server and still get to manipulate the local FS, I too would be very interested… Hmmm... Now that I think about it, I wonder if the new Google Chrome project “Native Client” can be used to do that. NaCl is much like ActiveX, except you are not limited to the Windows platform (but will be limited to the Google Chrome browser, at least for now). There is added security via sandboxing, but you can manipulate the local FS… The more I think about it, the more I am beginning to suspect that it can be done.
Any thoughts?
In Windows, you could embed the IE ActiveX control, which uses the same rendering engine as IE. (That's a plus and a minus) You can set the ScriptObject property in your host code and access it in Javascript as window.external to do things that Javascript cannot do.
If you run a local web server, you could have an exit link in the app that kills the web server.
You did not mention the OS you need to target. But you might be able to create a program that starts the web server, then launches the default browser, waits until the browser is terminated by the user, and then shuts down the web server.
So, for example, on Windows you can use CreateProcess() to spawn the process, then MsgWaitForMultipleObjects() to wait until it has finished executing.
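
The same launch-and-wait idea can be sketched in Ruby (the language of the main question) with Process.spawn and Process.wait2. The helper name run_until_closed, the firefox command, and the server object in the closing comment are assumptions for illustration only.

```ruby
# Hypothetical helper: start `command` (an argv array), block until that
# process exits, then run the given cleanup block even if waiting failed.
def run_until_closed(command)
  pid = Process.spawn(*command)
  _, status = Process.wait2(pid)
  status.exitstatus
ensure
  yield if block_given?
end

# Example with assumed names: `server` is whatever local web server object
# the application started earlier.
#
#   run_until_closed(["firefox", "http://127.0.0.1:8080"]) { server.shutdown }
```

One caveat: many browsers hand the URL to an already running instance and the launcher process exits immediately, so this only works reliably when the spawned process really lives as long as the window does.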
HTML Applications (HTA, for short) have been around for a while. You can read all about them here. They are basically HTML and Javascript with some extra options to create a window and with access to the local file system. They seem to be exactly what you want. It is a Microsoft technology, so this will only work with IE on Windows systems. I've successfully used this as a front-end for a CD-ROM that was used to distribute software to first-year students.
Another option would be to use Adobe Air. I'm not all that familiar with the technology, but it seems to provide a framework to deploy web pages as desktop applications. I can't post a second link as a guest, but just google it and you'll find it soon enough.
Today, in 2023, you can simply use any installed web browser as GUI using the WebUI library.

How to Programmatically take Snapshot of Crawled Webpages (in Ruby)?

What is the best solution to programmatically take a snapshot of a webpage?
The situation is this: I would like to crawl a bunch of webpages and take thumbnail snapshots of them periodically, say once every few months, without having to manually go to each one. I would also like to be able to take jpg/png snapshots of websites that might be completely Flash/Flex, so I'd have to wait until it loaded to take the snapshot somehow.
It would be nice if there was no limit to the number of thumbnails I could generate (within reason, say 1000 per day).
Any ideas how to do this in Ruby? Seems pretty tough.
Browsers to do this in: Safari or Firefox, preferably Safari.
Thanks so much.
This really depends on your operating system. What you need is a way to hook into a web browser and save that to an image.
If you are on a Mac - I would imagine your best bet would be to use MacRuby (or RubyCocoa - although I believe this is going to be deprecated in the near future) and then to use the WebKit framework to load the page and render it as an image.
This is definitely possible; for inspiration you may wish to look at the Paparazzi! and webkit2png projects.
Another option, which isn't dependent on the OS, might be to use the BrowserShots API.
There is no built-in library in Ruby for rendering a web page.
Using Selenium with Ruby is one possibility. You can run Firefox as a headless browser (i.e. on a server).
Here is the source code for browser shots. http://sourceforge.net/projects/browsershots/files/
If you are using Linux you could use http://khtml2png.sourceforge.net/ and script it via Ruby.
Some paid services you could try automating:
http://webthumb.bluga.net/home
http://www.thumbalizr.com
As viewed by which browser? IE? Firefox? Opera? One of the myriad WebKit engines?
If only it were possible to automate http://browsershots.org :)
Use selenium-rc; it comes with snapshot capabilities.
With JRuby you can use SWT's browser library.
