Best ruby binding/gem for curl/libcurl - ruby

I want to use the curl tool from Ruby. So far I have been invoking curl on the command line and then parsing the data it dumps to a file, but I would like to use it from within my application instead. That would give me better control over the handling, etc.
There are a few gems out there, e.g. http://curb.rubyforge.org/ and http://curl-multi.rubyforge.org/, but it's not clear which one is the best to use. My criteria for deciding are:
Stability and reliability of the library
Comprehensive support for the underlying curl features. (I will be relying heavily on data posting, forging HTTP headers, redirects, and multi-threaded requests.)
It would be great to get some feedback.
Thanks for your help.
-Pulkit

I highly recommend Typhoeus. It wraps libcurl and allows for all sorts of parallel and async possibilities. It supports SSL, stubbing, following redirects, and custom headers, does true parallel requests for blazing speed, and generally has yet to let me down. It is also well maintained: at the time of writing, the last commit was 2 days ago!
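To give a flavour of the API, here is a minimal sketch (URLs and header values are placeholders, not taken from the question) of a single request with a custom header plus a pair of parallel requests through Hydra:

    require "typhoeus"

    # Single request: follow redirects and send a custom header.
    response = Typhoeus.get(
      "https://example.com/api/items",          # placeholder URL
      followlocation: true,
      headers: { "User-Agent" => "my-app/1.0" } # placeholder header
    )
    puts response.code

    # Parallel requests: queue several on a Hydra and run them together.
    hydra = Typhoeus::Hydra.new
    requests = %w[https://example.com/a https://example.com/b].map do |url|
      request = Typhoeus::Request.new(url, followlocation: true)
      hydra.queue(request)
      request
    end
    hydra.run
    requests.each { |r| puts "#{r.base_url} -> #{r.response.code}" }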

Related

How can I test/send multiple (fake) ajax-requests at once to a (node.js) server?

At a certain point your (node.js) app works well with single requests, and you would like to see what happens when fifty people use it at the same time. What happens to the memory usage? What happens to the overall response speed?
I reckon this kind of testing is done a lot, so I was thinking there might be a relatively easy helper program for it.
By relatively easy I mean something as convenient as POSTman - REST Client is for testing a single request and response.
What is your recommended (or favorite) method of testing this?
We use http://jmeter.apache.org/. It is free and powerful: you can define test use cases and run them.
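Since the rest of this thread is Ruby-centric: if all you want is a quick-and-dirty burst of concurrent requests rather than scripted scenarios, Typhoeus::Hydra can do that too. A minimal sketch (the target URL and request count are placeholders):

    require "typhoeus"

    # Fire 50 concurrent GETs at the app under test and tally the response codes.
    hydra = Typhoeus::Hydra.new(max_concurrency: 50)
    requests = Array.new(50) do
      request = Typhoeus::Request.new("http://localhost:3000/endpoint")  # placeholder URL
      hydra.queue(request)
      request
    end

    started = Time.now
    hydra.run                                    # blocks until all requests have finished
    elapsed = Time.now - started

    codes = requests.map { |r| r.response.code }.tally
    puts "#{requests.size} requests in #{elapsed.round(2)}s, response codes: #{codes}"

This is no substitute for JMeter's test plans, but it is enough for a first smoke test.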

running parallel HTTP requests using typhoeus with Hydra in ruby

I was going through http://typhoeus.github.com/articles/getting_started.html#making_parallel_requests
and I couldn't quite understand how Typhoeus with Hydra makes parallel HTTP requests possible. Is it similar to how EventMachine::Iterator and EventMachine::HTTPRequest handle concurrent requests? I am planning to go through its source code, but if anyone already knows what is going on in the back end, please enlighten me. It will help me a lot to understand Typhoeus better.
Thanks!
Typhoeus is a libcurl wrapper and doesn't do the parallel requests itself. Instead it provides an interface to libcurl's multi interface (http://curl.haxx.se/libcurl/c/libcurl-multi.html), which takes care of running requests in parallel. That makes it different from EventMachine, because libcurl does the heavy lifting, so you don't have to worry about it in your Ruby code.
To be even more precise, Typhoeus (since 0.5.0.alpha) uses Ethon (https://github.com/typhoeus/ethon) instead of dealing with libcurl on its own. If you want to see how Ethon works with libcurl's multi interface, this is a good starting point: https://github.com/typhoeus/ethon/blob/master/lib/ethon/multi.rb.
In case you want to know what's really going on, you should look into libcurl itself.
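As a rough illustration of what Hydra does under the hood (the URLs here are placeholders), you can drive libcurl's multi interface directly through Ethon:

    require "ethon"

    # Two easy handles driven by one multi handle. libcurl interleaves the
    # transfers itself, on a single thread, which is what Hydra builds on.
    multi = Ethon::Multi.new
    easies = %w[https://example.com/a https://example.com/b].map do |url|  # placeholder URLs
      easy = Ethon::Easy.new(url: url, followlocation: true)
      multi.add(easy)
      easy
    end
    multi.perform                                   # blocks until every transfer has finished
    easies.each { |e| puts "#{e.url} -> #{e.response_code}" }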

Client bandwidth usage with Ruby's net/http

I am trying to track the bandwidth usage of individual requests in ruby, to see how much of my network usage is being split between different API calls.
I can't find anything in net/http or Ruby's socket classes (TCPSocket, et al.) that offers a decent way to do this without a lot of monkey patching.
I have found a number of useful tools on Linux for doing this, but none of them give me enough granularity to look inside the HTTP requests at the headers (so I could figure out which URL we are requesting). The tools I am using are vnStat and ipfm, which are great for system-wide or host/network bandwidth monitoring.
Ideally I would like to do something within the Ruby code to track the data sent/received. I think if I could just get the raw headers and add their length to the body length, for both send and receive, that would be Good Enough™.
Can you use New Relic for Ruby? It has great visualizations/graphs for network usage/api calls.
It seems like it would be pretty easy to write a middleware to track this using Faraday.
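For what it's worth, here is a rough sketch (the class and its names are my own invention, not an existing gem) of such a Faraday middleware, estimating bytes sent and received by summing header and body lengths:

    require "faraday"

    # Estimates bytes sent/received per request from header and body lengths.
    class ByteCounter < Faraday::Middleware
      def call(env)
        sent = approx_size(env.request_headers) + env.body.to_s.bytesize
        @app.call(env).on_complete do |response_env|
          received = approx_size(response_env.response_headers) +
                     response_env.body.to_s.bytesize
          puts "#{env.url}: ~#{sent} bytes out, ~#{received} bytes in"
        end
      end

      private

      # Headers go over the wire as "Name: value\r\n", hence the +4 per header.
      def approx_size(headers)
        headers.sum { |name, value| name.bytesize + value.to_s.bytesize + 4 }
      end
    end

    conn = Faraday.new("https://example.com") do |f|  # placeholder host
      f.use ByteCounter
      f.adapter Faraday.default_adapter
    end
    conn.get("/api/items")                            # placeholder path

This only approximates the wire size (it ignores the request/status lines, chunked encoding, and TLS overhead), but it should be enough to compare API calls against each other.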

proxy for scale, performance (to load external content)?

I am sure the answers to this question will be very subjective; I simply want to know what the options are out there for building a proxy to load external content.
Typically I have used cURL in PHP, passing a variable like proxy.url to fetch the content, and then made an AJAX call from JavaScript to populate the page with it.
EDIT:
YQL (Yahoo Query Language) looks like a very promising solution to me; however, it has a daily usage limit, which essentially prevents me from using it for large-scale projects.
What other options do I have? I am open to any language and any platform; the key criteria are performance and scalability.
Please share your ideas, thoughts and experience on this topic.
Thanks,
You don't need a proxy server or anything like that.
Just create a cronjob to fetch the content every 5 minutes (or however often you want).
All you need is a script, started by the cronjob, that grabs the content from the web and saves it somewhere (a file, a database, ...).
When somebody requests your page, you just send out the cached content and do with it whatever you want.
I don't think scalability or performance will be a problem.
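A minimal sketch of that cron-driven cache (all paths and URLs below are placeholders):

    #!/usr/bin/env ruby
    # Fetch external content and cache it to disk; schedule with something like
    #   */5 * * * * /usr/bin/ruby /path/to/refresh_cache.rb
    require "net/http"
    require "uri"

    SOURCES = {
      "feed.json" => "https://example.com/feed.json",   # placeholder URLs
      "news.html" => "https://example.org/news"
    }
    CACHE_DIR = "/var/cache/myapp"                       # placeholder path

    SOURCES.each do |filename, url|
      begin
        body = Net::HTTP.get(URI(url))                      # fetch the external content
        File.write(File.join(CACHE_DIR, filename), body)    # your pages serve this file
      rescue StandardError => e
        warn "failed to refresh #{url}: #{e.message}"       # keep the stale copy on failure
      end
    end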
Depending on what you need to do with the content, you might consider Erlang. It's lightning fast, ridiculously reliable, and great for scaling.

Anything wrong with moving CLI validation/logic server-side?

I have a client/server application. One of the clients is a CLI. The CLI performs some basic validation and then makes SOAP requests to a server. The response is interpreted and the relevant information is presented to the user. Every command involves a request to a web service.
Every time the services are modified server-side, a new CLI needs to be released.
What I'm wondering is whether there would be anything wrong with making my CLI incredibly thin. All it would do is send the command string to the server, where it would be validated and interpreted, and a response string returned.
(Even TAB completion could be done with the server's cooperation.)
I feel in my case this would simplify development and reduce maintenance work.
Are there pitfalls I am overlooking?
UPDATE
Scalability issues are not a high priority.
I think this is really just a matter of taste. The validation has to happen somewhere; you're just trading complexity in your client for the same amount of complexity on the server. That's not necessarily a bad thing for your architecture; you're really just providing an additional service that gives callers an alternate means of accessing your existing services. The only pitfall I'd look out for is code duplication; if you find that your CLI validation is doing the same things as some of your services (parsing numbers, for example), then refactor to avoid the duplication.
In general you'd be okay, but client-side validation is a good way to reduce your workload, since bad requests can be rejected early.
What I'm wondering is if there would be anything wrong with making my CLI incredibly thin.
...
I feel in my case this would simplify development and reduce maintenance work.
People have been doing this for years by using telnet/SSH to remote a CLI that runs on the server. If all the intelligence must be on the server anyway, there might be no reason to have your CLI be a distributed client with intelligence of its own. Just have it be a terminal session; if you can get away with using SSH, that's what I'd do. Then the client piece is done once (or is possibly just an off-the-shelf bit of software), and all the maintenance and upgrades happen on the server (welcome to 1978).
Of course this only really applies if there really is no requirement for the client to be intelligent (which sounds like the case in your situation).
Using name/value pairs in a request string is actually pretty prevalent. But at that point, why bother with SOAP at all? Why not just move to a RESTful architecture instead?
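For illustration, the "incredibly thin" client could be little more than this (the endpoint and parameter name are hypothetical):

    #!/usr/bin/env ruby
    # Forward the raw command line to the server as a name/value request and
    # print whatever text comes back; all validation happens server-side.
    require "net/http"
    require "uri"

    uri = URI("https://server.example.com/cli")       # placeholder endpoint
    response = Net::HTTP.post_form(uri, "command" => ARGV.join(" "))
    puts response.body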
