Automatically filling in web query forms and returning data (for a newbie) - data-extraction

I am whatever comes before 'novice' in programming. I have written macros in VBA for Excel, and used Visual Studio a bit when I was younger, but that's about it.
My problem: To produce the reports I need at work, I have to extract data that is stored behind user-friendly query forms on my company's intranet. I have automated every other part of the report except this. I would like to write a program to access this webpage and fill in query forms for me with preset values, and then return the data that is output. I had a discussion with a computer scientist friend of mine who said this was easy to do with Haskell (his language of choice). However I'm no veteran so I'd like to learn a language a bit nearer to my level... Python seems a good bet.
My question: is it possible to do this type of data extraction with Python? How difficult would it be, and what is a good resource to teach myself about it?
I've done some research and come up with Scrapy, but I can't tell whether it fills in forms. Also, if there are other languages more suited to this, I'd be glad to hear it.

The easiest way is just to use urllib2. Usually, the arguments to your forms are sent to the server in the URL, where you can see them as ?foo=bar&bla=blah. You can generate the arguments for your forms with urllib.urlencode (urllib.parse.urlencode in Python 3):
Python and urllib2: how to make a GET request with parameters.
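A minimal sketch of that idea, assuming Python 3 (where the old urllib/urllib2 functions live in urllib.parse and urllib.request). The address and the field names are placeholders, not anything from your intranet:

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, params):
    """Append form arguments to the URL as ?foo=bar&bla=blah."""
    return base_url + "?" + urlencode(params)


def fetch(base_url, params):
    """Send the GET request and return the response body as text."""
    with urlopen(build_query_url(base_url, params)) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical address and field names, for illustration only.
url = build_query_url("http://intranet.example/report", {"foo": "bar", "bla": "blah"})
print(url)
```

Calling fetch() with the real address and field names of your form should then return the page that the form would normally show you.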
For a newbie, you formulate your thoughts very clearly, congrats.

I would start by reading some basic tutorials on HTTP. A form is basically just a visual way to collect data. The meat of the form is the request your browser makes with that form data.
So "filling in forms" is really not necessary (it may be, though hopefully it's not, because it CAN get complicated). What is necessary is learning what request that form actually makes to the server and emulating it. A super easy way to do this is with Chrome Developer Tools or a Firefox extension called Firebug. Each of these provides you with a way to see all network traffic, including form submissions.
For example, if you have a form where you have to submit a date and a report type, the actual web request may look like
?date=2012-09-12&type=overview
So basically you would just have to find a way to make an HTTP request to the URL with that data. This is a trivial task, and pretty much all languages have a way to do it.
It is very possible to do this with Python. There is an abundance of tutorials out there. Python has URL libraries built into the standard library that can help:
http://docs.python.org/library/urllib.html
Every time I use urllib2, I usually end up at http://www.voidspace.org.uk/python/articles/urllib2.shtml
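Many forms send their data in the body of a POST request instead of in the URL. Here is a rough sketch of emulating that with the Python 3 standard library, assuming the form uses the default URL-encoded format. The address is made up, and the date/type fields are simply the ones from the example above; check the developer tools to see what your form really sends.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_form_request(url, fields):
    """Build the POST request a browser would send when the form is submitted."""
    body = urlencode(fields).encode("ascii")
    return Request(
        url,
        data=body,  # supplying data makes urllib send a POST instead of a GET
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def submit(request):
    """Send the request and return the response body as text."""
    with urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical address; the field names come from the example above.
request = build_form_request(
    "http://intranet.example/report",
    {"date": "2012-09-12", "type": "overview"},
)
print(request.get_method(), request.data)
```

If the intranet needs a login first, you would also have to carry cookies between requests, which this sketch does not do.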

Combining loginform and scrapy, you can automate filling forms and crawling web pages.
Here's a tutorial on it. http://blog.scrapinghub.com/2012/10/26/filling-login-forms-automatically/
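To give a feel for what form-filling tools do under the hood, here is a standard-library sketch. It is not the loginform or Scrapy API, just an illustration: it collects the input fields of a form (including hidden ones) and overrides some of them with preset values. The HTML page and all field names are invented for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class FormFieldCollector(HTMLParser):
    """Collect the first form's action and the name/value of each <input>."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attrs.get("action", "")
        elif tag == "input" and attrs.get("name"):
            # Simplification: inputs from every form on the page end up here.
            self.fields[attrs["name"]] = attrs.get("value") or ""


def fill_form(html, preset):
    """Return (action, encoded body) with preset values replacing the defaults."""
    collector = FormFieldCollector()
    collector.feed(html)
    fields = dict(collector.fields)
    fields.update(preset)
    return collector.action, urlencode(fields)


# Invented page, for illustration only.
PAGE = """
<form action="/login" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="user">
  <input type="password" name="password">
</form>
"""

action, body = fill_form(PAGE, {"user": "me", "password": "secret"})
print(action, body)
```

Keeping hidden fields such as the token is the main reason to parse the page at all; for real sites with several forms, select boxes, and so on, a library is the saner choice.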

Related

Visual Studio/Basic how to write to a file online

I was wondering how I could write to a file online within Visual Studio/Basic. I want this because I have a register/login system, but I've only been able to write to the user's computer. Is this possible?
One thing you could do would be to implement a web browser into your Visual Basic application. Then you can do things such as:
WebBrowser1.document.getElementById("username").innerHTML = "Donald"
Implementing this would require you to dig into the source code of the web page a little bit in order to decide what the proper names for elements are. This can be tricky for sites like Facebook who obviously try very hard to prevent people from making macros for their site. This is a valid option though and I've put this technique to use a few times.

Codeigniter and Google maps Api V 3

I am looking for some advice from somebody who has used Google maps and Codeigniter. I am new to maps and working on a project that is built with Codeigniter and uses Google Maps. I am wondering whether to incorporate it directly into the project or use a library for it.
I have found a library here - http://biostall.com/codeigniter-google-maps-v3-api-library and have started using it and have found it easy to use for incorporating maps. I am wondering however if anyone else has used it and if so does it have the full functionality of Google Maps.
I know that Google Maps has amazing features and I am a bit anxious to continue with the library and discover later in my project that it doesn't support the functionality I might need. I am going to keep researching it but if anyone has experience with it I would appreciate some advice.
Yes, I've used two different Google Maps libraries for CodeIgniter 2.
I ended up keeping the one you've linked for both projects. It was cleaner to use than the other, with fewer helpers to load and fewer lines to write in order to create a simple map. I don't know what else you're really looking for here. Also, with this library, I only needed to pass two variables into the View... whereas with the other, the View needed somewhat more complex code. Really, there are only two variables that need to be passed into the view... the Map and the JavaScript for the map. If you're clever, you could also combine them into one.
Quote OP:
"I know that Google Maps has amazing features and I am a bit anxious to continue with the library and discover later in my project that it doesn't support the functionality I might need."
So what? If that's your only concern, then don't worry. Switching out something like this is pretty easy. Since it's invoked and configured in your Controller (or it should be), there's relatively little code to change.
(The developer was also very responsive to my support requests, which is saying a lot for an open source project.)

How to implement simple online management for a book library?

At my institution, we have a small library with 150 books and 50
users. We would like to use a simple online management system that
displays the books, lets users search and enter when they get and
return a book. (There is no librarian, the books are just in an
otherwise empty room.)
I'm not familiar with modern web content management systems. In the
old days, I would have just implemented a quick Perl/CGI script, but I
think there are better options nowadays?
What would be the simplest way to get/implement such a system? Django?
Ruby on Rails? Ideally, I'd like to just run it in my user account
without having to install database support etc.
Is it possible to do everything on one dynamic HTML page? What role
does AJAX play in such a system?
I suggest taking a look at the available open source tools for libraries before deciding to build one from scratch:
http://www.libsuccess.org/index.php?title=Open_Source_Software#Great_Free.2FOpen_Source_Tools_for_Libraries
 
Another good resource in your research: http://www.oss4lib.org/
 
If you find an existing tool that fits the bill (or enough to make it worth extending), that will be important in guiding what platform/language/framework and techniques will be best to use.
If you want a quick and easy solution, you might want to consider using SQLite as the database backend, since it does not require any configuration or setup (except for the tables, of course).
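As a rough sketch of how little setup SQLite needs, here is a version using Python's built-in sqlite3 module. The table layout and function names are my own assumptions, not a recommendation of any existing library system:

```python
import sqlite3


def open_library(path=":memory:"):
    """Open (or create) the database; pass a file name to keep data between runs."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS books (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            borrower TEXT  -- NULL means the book is on the shelf
        )
        """
    )
    return conn


def add_book(conn, title):
    with conn:
        return conn.execute("INSERT INTO books (title) VALUES (?)", (title,)).lastrowid


def check_out(conn, book_id, user):
    """Mark a book as borrowed; return False if it is missing or already out."""
    with conn:
        cur = conn.execute(
            "UPDATE books SET borrower = ? WHERE id = ? AND borrower IS NULL",
            (user, book_id),
        )
    return cur.rowcount == 1


def check_in(conn, book_id):
    with conn:
        conn.execute("UPDATE books SET borrower = NULL WHERE id = ?", (book_id,))


def search(conn, text):
    """Return (id, title, borrower) rows whose title contains the text."""
    return conn.execute(
        "SELECT id, title, borrower FROM books WHERE title LIKE ? ORDER BY id",
        ("%" + text + "%",),
    ).fetchall()


conn = open_library()
book_id = add_book(conn, "Learning Python")
check_out(conn, book_id, "alice")
print(search(conn, "python"))
```

The same functions could sit behind a web form or a small desktop interface; the storage part stays the same either way.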
If you have a machine standing around there, you could take a look at Qt/C++ or PyQt to create a simple user interface.
Pylons (there are lots of alternatives!) or any other web framework might do the job as well, but I guess it would be more work to create a web application than a quick and simple desktop application for this job.
This is quite a complicated question that doesn't have a simple answer. The best I can do is point you in the direction of some resources to get you started:
Framework/CMS
Unfortunately, most frameworks require at least some minimal kind of db interaction. While this is not true for all of them, it would probably be easiest to steer clear of a framework; you probably don't need that much overhead anyway.
Javascript/AJAX
If you want things to happen without any separate page loads, then sure, you can use some AJAX. However, you probably don't need anything this sophisticated.
How I Would Do It
If you really trusted your students enough to be diligent about checking in/out books, I think it would be easiest to just have a form on a webpage somewhere that they could enter the number of the book they are checking in/out. Then store the state of each book in a text file somewhere (you said you didn't want to use any db's), or even look into sqlite.
Again, you probably don't need all the overhead of a full framework/CMS. It would be fairly trivial to, as you said, write a quick script to handle the ISBN, ID, title, or whatever of the book they are checking in/out.
Also, there are significantly easier languages to write scripts in these days than Perl and CGI. Try PHP, Ruby, or Java.
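For the text-file variant, a sketch could look like the following. It is written in Python only to keep it short; the file name, the JSON format, and the function names are assumptions for illustration. It does no locking, so two people submitting at the very same moment could overwrite each other's change.

```python
import json
import os
import tempfile


def load_state(path):
    """Return {book_number: borrower_or_None}; a missing file means no records yet."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def save_state(path, state):
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2, sort_keys=True)


def toggle(path, book_number, user):
    """Check the book out if it is in, or back in if it is out; return the new borrower."""
    state = load_state(path)
    key = str(book_number)  # JSON object keys are always strings
    state[key] = None if state.get(key) else user
    save_state(path, state)
    return state[key]


# A temporary directory stands in for wherever the real file would live.
state_file = os.path.join(tempfile.mkdtemp(), "books.json")
print(toggle(state_file, 42, "alice"))
print(toggle(state_file, 42, "alice"))
```

The form handler would call toggle() with the book number the user typed in.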

Visual VoiceXML/VXML development tool?

Does anyone know of any tools out there that will let me run and debug a VXML application visually? There are a ton of VXML development tools, but they all require you to build your application within them.
I have an existing application that uses JSPs to generate VXML, and I'm looking for a way to navigate through and debug the rendered VXML in much the same way that Firebug allows one to do this with HTML. I have some proxy-like tools that let me inspect the rendered code as it is sent to the VXML browser, but there's a ton of JS, which makes traversing the code by hand rather difficult.
Has anyone worked with a product that allows for this?
Thanks!
IVR Avenger
There is the JigSaw test suite; it has a free trial license and is reasonably priced.
There is IBM's debugger - part of WebSphere Voice Toolkit.
Many other products have debuggers - a very good summary is here
Disclaimer: I am the development manager for Voiyager (www.voiyager.com), a VoiceXML testing tool. It doesn't meet your criteria nor do I believe it is the type of tool you want, but I thought it was worth mentioning it.
As far as I know, there isn't such a test tool for VoiceXML. In fact, there are very few VoiceXML tools on the market, and hardly any of them do testing or analysis. The vendors that created development tools have all been acquired by other companies. Some of them did offer various forms of debugging that were specific to their tool set or stayed at the Dialog (caller input) level. From your question, I'm assuming you need much lower level debugging capabilities.
I think the alternative paths are minimal and somewhat difficult. I believe your primary goal is to debug or rewrite an existing application, but you haven't provided any specific challenges beyond the JavaScript. Some thoughts or approaches that may help:
Isolate the JavaScript and place the code into a unit test harness. That will go a long way to understanding the logic of the application. Any encapsulation of the JavaScript you perform will probably go a long way towards better code maintainability.
Attempt to run the VoiceXML through a translation layer to HTML so you could use Firebug. The largest challenge would involve caller input (i.e. processing the SRGS grammars). You could probably cheat this by just having the form accept a JSON string that populates the field values. There are tools on the market to test grammars. Depending on the nature of your problems, you could take a simple and light approach and attempt this over just the trouble areas.
Plumb the application with a lot of logging. This can be done through the VoiceXML LOG element, or push the variable space back to the server. By adding intermediate forms, you may be able to provide a dump from each via the VoiceXML Data element.
See if your application will run in one of the open source VoiceXML browsers (not sure of the state of the open source browsers as we've built and bought for our various product lines). If you can get it mostly working, you can use the development debugger to provide some ability to step through the logic. However, it is probably one of the more difficult paths as you'll really need to understand the browser to know when and where to stick your breakpoints and to figure out how to expose the data you want.
Good luck on the challenge. If you find another approach, I would be interested in seeing it posted.
An alternative debug environment is to use something like Asterisk with a VoiceXML browser plugin like the one from http://www.voiceglue.org/ or, for a limited licence, i6net.
You can keep all the pieces separate (dynamic HTML and VXML application in php/jsp/j2ee, TTS processing, and optional ASR processing) as separate virtual machines with something like VirtualBox. If the logic can be kept the same, then it is just a matter of changing the UI based on the channel.
A softphone is all you need to call a minimal asterisk machine, which has the voicexml browser with the url of the vxml in the call plan.
I just used Zend Framework as php is used in this environment, and changed view suffixes(phtml vs vxml) based on the user-agent string.
Flite for TTS is fine for debugging, and when your app is ready you can either record phrases or improve Flite's output; there was a page on the Ubuntu forums with directions for how to increase Flite quality with some additional sound files.
Have you tried Eclipse VTP or InVision Studio?
Eclipse VTP
This is an Eclipse plugin, but I find it a little user-unfriendly (from a Japanese viewpoint).
InVision Studio (requires creating a user account)
This is Convergys's IVR tool. It has a mode for editing standard VXML. (Unfortunately, it's not an exact match.)
For just debugging vxml, I use Nuance Cafe's VoiceXML checker. It doesn't give you a visual tree or anything, but it's pretty good at spotting syntax errors and is free. I think they might also have more advanced debugging tools if you look into it, but I haven't had the need. (Note: I have no association with them)
http://cafe.bevocal.com/tools/vxmlchecker/vxmlchecker.jsp
I'm looking into the same problem, and most of the links are down. I found a document that proposes an open source solution, which works as a plugin for Asterisk (https://www.researchgate.net/publication/228873959_Open_Source_VoiceXML_Interpreter_over_Asterisk_for_Use_in_IVR_Applications) and is available at https://sourceforge.net/projects/voxy/
I would like to know if there are current options to create a VXML structure graphically, like the next image.

Macro/Scripting language for non-developers with a simple GUI-based editor

We wish to let people add some logic to their accounts (say, given a few arguments, how to compute a particular result). So, essentially, this would be tantamount to writing simple business rules with support for conditionals and expressions. However, the challenge is to provide them with a simple online editor where they can create the logic (preferably) by completely visual means (drag/drop Expr-tree nodes maybe -- kinda like Y! pipes).
Does anybody know of a scripting/macro/domain-specific language that lets people do this? The challenge is the visual editor, since we don't wish to invest in developing the UI to do the editing. The basic requirements would be:
1. Embedded into another language, or run securely (no reboot -n or <JUNK-DANGEROUS-COMMAND> >> ~/.bashrc)
2. Easily accessible to users without coding background (no need of any advanced features)
3. Preferably have a simple GUI based editor to create the logic programs accessible to non-developers (kinda like spreadsheets)
4. Some ability to generate compile-time warnings (invalid code) would be good (Type safety?)
5. Ability to embed some data before execution which is available to the interpreter (Eg., name, birthday, amount)
Anybody tried doing something like this and got any ideas? I looked at Lua, Io, Python, Ruby and a host of others, but the challenge essentially is that I don't think non-programmers will be able to understand the code all that much. Something that could be added via "meta-programming" to say a Ruby would be good as well, if an editor could be easily developed!
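To make requirements 1, 4 and 5 concrete, here is a rough sketch of the kind of restricted rule evaluator I have in mind, using Python's ast module (Python 3.8 or later). It is not an existing product, it does not solve the visual editor problem, and the rule syntax, function names and sample data are all invented for illustration. Rules are checked against a whitelist before anything runs, and the only names they can see are the data fields passed in.

```python
import ast
import operator

_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_COMPARE = {
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Gt: operator.gt,
    ast.GtE: operator.ge,
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
}
# Anything not listed here (calls, attribute access, imports, ...) is rejected.
_ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.Compare, ast.IfExp,
    ast.Name, ast.Constant, ast.Load,
) + tuple(_BINARY) + tuple(_COMPARE)


def validate(source, known_names):
    """Parse a rule and raise before execution if it is invalid or not allowed."""
    tree = ast.parse(source, mode="eval")  # raises SyntaxError on malformed rules
    for node in ast.walk(tree):
        if not isinstance(node, _ALLOWED_NODES):
            raise ValueError("not allowed in a rule: " + type(node).__name__)
        if isinstance(node, ast.Name) and node.id not in known_names:
            raise ValueError("unknown field: " + node.id)
    return tree


def _eval(node, data):
    if isinstance(node, ast.Expression):
        return _eval(node.body, data)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return data[node.id]
    if isinstance(node, ast.BinOp):
        return _BINARY[type(node.op)](_eval(node.left, data), _eval(node.right, data))
    if isinstance(node, ast.Compare):
        left = _eval(node.left, data)
        for op, comparator in zip(node.ops, node.comparators):
            right = _eval(comparator, data)
            if not _COMPARE[type(op)](left, right):
                return False
            left = right
        return True
    if isinstance(node, ast.IfExp):
        branch = node.body if _eval(node.test, data) else node.orelse
        return _eval(branch, data)
    raise ValueError("unsupported node: " + type(node).__name__)


def run_rule(source, data):
    """Validate the rule against the available fields, then evaluate it."""
    return _eval(validate(source, data.keys()), data)


# Invented sample data embedded before execution (requirement 5).
account = {"name": "Ann", "amount": 1200}
print(run_rule("amount / 10 if amount > 1000 else 0", account))
```

A drag-and-drop editor could build the same expression tree directly instead of having users type the text, which is the part we would still like to avoid writing ourselves.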
As a matter of fact, Microsoft is developing Oslo, which is right up your alley.
Chris Sells has been writing a lot about it recently.
It is designed to be a way to author DSLs and also to visually author these models with a graphical tool called Quadrant. Sounds very very similar to what you are looking for.
Open source wise, Ruby I think can be close, as you can see if you look at _whytheluckystiff's Try Ruby or Hackety.
I don't think you'll find anything that isn't too generic, especially regarding the GUI editor. There are no generic tools as far as I know that will be able to automatically interface with your program, query data from it, and interpret the script into commands in your software -- if there are, I'd like to have a copy. Not being flippant, but you will have to do some (probably a lot of) work to get this working. It will probably result in you writing a custom DSL.
I would take a look at PowerShell. You could surface all the activities a user would like to script in a very readable way.
There is some talk of using PowerShell to create a DSL on the PowerShell team blog and Bruce Payette, the technical lead, talks about this in his book Windows PowerShell in Action from Manning.
At the other end of the scale is to write something simple as a HyperText Application (HTA) -- assuming Windows of course -- along the lines of my Clive tool. The article on the blog doesn't mention the HTA version, but essentially I could enter VBScript-ish code into one textarea and interpret it on the spot, output going into another text area on the form.
With HTAs giving you all the form control of HTML, plus the DOM, you could come up with something interesting fairly quickly.
