Network programming with Ruby

My homework is to use Ruby to build a simple spider (or crawler).
Which built-in module should I use to fetch a web page in Ruby?
I know Python has a urllib module!

You could take a look at the Net::HTTP module or search for something that builds on top of it.
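For example, a minimal fetch with Net::HTTP might look like this (the URL is a placeholder):

require 'net/http'
require 'uri'

# Fetch a page with the built-in Net::HTTP client.
uri = URI.parse('http://www.example.com/')
html = Net::HTTP.get(uri)
puts html

From there, a spider just needs to parse the HTML for links and repeat.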

Related

What methods exist to auto-generate ruby client stubs from WSDL files?

I'm using Ruby and the Savon gem to interact with SOAP/WS and would like to auto-generate the client request methods from the WSDL in Ruby.
Before I do this, I'd like to know if there's any other Ruby/SOAP library that does this?
Edit: Please note, I already know this isn't available in Savon out of the box; in fact, my intention is to add the feature. I'm in the process of checking whether this already exists somewhere else written in Ruby.
Since it's only been a few days since you asked this question, and I've run into the same problem, I've decided to create a small script to do that.
Download it, save it as objects.rb for example, and run it with bundle exec objects.rb path_to.wsdl:
https://gist.github.com/4622792
Let me know if it works ^^
Take a look at Savon's spec; it has a pretty rich testing environment.
I think ads_common by Google is relevant to you.
google-api-ads-ruby/ads_common at master · googleads/google-api-ads-ruby
rake generate can create the client libraries automatically from WSDL.
It is specialized for Google Ads, but the same approach should help in building a general-purpose WSDL-to-client generator in Ruby.
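For reference, Savon 2 can at least enumerate the operations a WSDL defines, which is a starting point for generating stubs. A minimal sketch, assuming Savon 2's API (the WSDL path is a placeholder):

require 'savon'

client = Savon.client(wsdl: 'path_to.wsdl')
puts client.operations.inspect   # => e.g. [:get_user, :update_user]

# Crude "stubs": define one wrapper method per operation on the client.
client.operations.each do |op|
  client.define_singleton_method(op) do |message = {}|
    call(op, message: message)
  end
end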

How do I test a curl-based Facebook API implementation?

I wrote my own Facebook library that makes actual curl requests, not libcurl calls.
Is there a way to test it? I'm asking because most solutions involve something like FakeWeb, which as far as I can tell will not work here.
The existing code can be found on my github page.
One approach would be to use a different host/port in test mode (e.g. localhost:12345).
Then, in your tests, run a Sinatra or WEBrick servlet on that port, configured to respond to the requests your code should be making.
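A minimal sketch of such a servlet with Sinatra (the route, port, and response body are placeholders; point the library at localhost:12345 in test mode):

require 'sinatra/base'

class FakeFacebook < Sinatra::Base
  set :port, 12345

  # Stub out whichever endpoints the library is expected to hit.
  post '/method/users.getInfo' do
    content_type :json
    '[{"uid": 1, "name": "Test User"}]'
  end
end

FakeFacebook.run! if __FILE__ == $0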
You could mock Request.dispatcher with the expected behavior, pretty much like FakeWeb would do.
There are a few examples in this file, especially https://github.com/chrisk/fakeweb/blob/master/lib/fake_web/ext/net_http.rb#L44.
When running your tests/specs, monkey-patch the run method of your Request class to hook into Myron Marston's VCR library. See the existing library_hooks subdir for examples and ideas on how to do this; the fakeweb implementation is a good place to start.
VCR works well with live services like Facebook's because it captures interactions "as is", and VCRs can be easily re-recorded when the services change.
I'm running into problems with your library, however. You need to require the cgi and json libraries; it also looks like it requires a Rails environment (it's failing to find with_indifferent_access on Hash).
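If wiring up a full VCR hook is too much, the same record/replay idea can be sketched by hand. This is purely hypothetical code: it assumes your Request class is already loaded and that its run method returns the response body as a string.

require 'digest'
require 'fileutils'

# Reopen the library's Request class (must be required first).
class Request
  alias_method :live_run, :run

  def run(*args)
    FileUtils.mkdir_p('spec/cassettes')
    cassette = File.join('spec/cassettes', Digest::MD5.hexdigest(args.inspect))
    if File.exist?(cassette)
      File.read(cassette)                                        # replay
    else
      live_run(*args).tap { |body| File.write(cassette, body) }  # record
    end
  end
end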

Ruby: How to screen-scrape the result of an Ajax request

I have written a ruby script to screen scrape something using the 'open-uri' and 'hpricot' gems - everything works great so far.
But now I have to screen-scrape something that is returned after a form is submitted via a JavaScript function (called by an 'onchange' event handler on a drop-down menu):
function submit_form() {
  document.list.action = "/some/sort/of/path";
  document.list.submit();
}
AFAIK, open-uri lets you submit only GET requests. And if I'm not mistaken, a POST request would be needed here.
So my question is: what do I need to install and 'require', and what would the Ruby code look like to make that POST request? Sorry, I'm still pretty much a n00b...
Thank you very much for your help!
Tom
I think you should definitely use Mechanize. It provides a nifty interface for interacting with remote pages, the forms on them, and so forth (see this example).
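A minimal sketch with Mechanize (the page URL is a placeholder; the form name 'list' and the action come from the JavaScript above):

require 'mechanize'

agent = Mechanize.new
page  = agent.get('http://www.example.com/page-with-form')

# Do what submit_form() does: change the action, then submit.
form = page.form_with(name: 'list')
form.action = '/some/sort/of/path'
result = agent.submit(form)
puts result.body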
The Ruby standard library has the Net::HTTP class, which naturally supports the POST operation:
Net::HTTP.post_form(URI.parse('http://www.example.com/some/sort/of/path'), 'some_field' => 'some_value')
(The form fields are passed as a hash; the field name here is a placeholder.)
If you find the API there less than optimal, then take a look at the httparty gem.
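With HTTParty the same POST might look like this (the field name is again a placeholder):

require 'httparty'

response = HTTParty.post('http://www.example.com/some/sort/of/path',
                         body: { 'some_field' => 'some_value' })
puts response.body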
Finally, while hpricot is a great gem, it isn't actively developed any longer. You should consider moving to nokogiri, which practically replaces hpricot and improves upon it.

Easy Ruby http/curl API to program with

I've been using Ruby for quite some time now; however, unlike PHP, as far as I know Ruby has no standard HTTP/curl library (for fetching pages and processing forms) that is as easy and powerful as PHP's libcurl binding.
While Net::HTTP is part of the Ruby standard library, I always find its API hard to remember and program with.
Can anyone give suggestions on which http/curl library I should use over Net::HTTP?
Take a look at HTTParty or REST Client.
I would recommend using the Typhoeus gem. It's got a pretty clean API and allows you to make concurrent requests.
I'll second Ryan's recommendation for Typhoeus, and recommend HTTPClient also. Both are very full featured and handle parallel requests easily.
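A minimal sketch of parallel requests with Typhoeus (the URLs are placeholders):

require 'typhoeus'

hydra = Typhoeus::Hydra.new
requests = %w[http://www.example.com/a http://www.example.com/b].map do |url|
  request = Typhoeus::Request.new(url)
  hydra.queue(request)
  request
end

hydra.run   # all queued requests run concurrently
requests.each { |request| puts request.response.code }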
For simple requests it's hard to beat Open-URI for its simplicity:
require 'open-uri'
html = open('http://www.example.com').read
If you're parsing a page it works great with Nokogiri:
require 'open-uri'
require 'nokogiri'
doc = Nokogiri::HTML(open('http://www.example.com'))
I recently wrote a wrapper for the Net::HTTP lib; it's very simplistic. I wanted something with a simple API that was easy to use and remember, and it's been working well for me:
https://github.com/ctcherry/plain_http

File downloader written in ruby

Which APIs are necessary to build a file downloader in Ruby?
The Net::HTTP namespace contains all the useful tools for making HTTP requests in Ruby.
The documentation isn't very clear, but it's very useful:
http://ruby-doc.org/stdlib/libdoc/net/http/rdoc/index.html
I found this example on Google:
http://snippets.dzone.com/posts/show/2469
Net::HTTP has always done the job for me; I hope it will be the same for you.
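For larger files it's worth streaming the body to disk instead of holding it all in memory. A minimal sketch with Net::HTTP (the URL and filename are placeholders):

require 'net/http'
require 'uri'

uri = URI.parse('http://www.example.com/file.zip')

Net::HTTP.start(uri.host, uri.port) do |http|
  request = Net::HTTP::Get.new(uri.request_uri)
  http.request(request) do |response|
    # read_body with a block streams the body in chunks.
    File.open('file.zip', 'wb') do |file|
      response.read_body { |chunk| file.write(chunk) }
    end
  end
end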
