Reusing connections between threads in Ruby / replacement for Net::HTTP::Persistent

I'm running a multithreaded daemon where an instance of Ruby Mechanize (which contains a Net::HTTP::Persistent object) might be used and run by one of many threads. I'm running into lots of problems because Net::HTTP::Persistent opens a new connection for each thread that uses it, so with 50 threads I end up opening 50 times more connections than I need! I've tried subclassing and patching Net::HTTP::Persistent to store its connection information on its class instead of in Thread.current, but then I keep getting
too many connection resets (due to Broken pipe - Errno::EPIPE)
all over the place. Any thoughts? Does anyone know an alternate library to Net::HTTP::Persistent I could use, and hopefully easily patch Mechanize with?

The problem is that if you access a Net::HTTP::Persistent object from another thread while that object is in the middle of something, that thread would either have to block (stop execution and wait for the object to finish what it's doing) or create a new object and work with that. With threading, you could be in the middle of an HTTP request (forgive me, I'm making assumptions here) when, all of a sudden, another thread wants to make an HTTP request over the same connection, which breaks everything (probably why you got the connection resets).
If you really want threading, your options are to keep that many connections open, or to make threads wait for a free connection before using one.
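One way to do the latter is a small connection pool. Here is a minimal sketch (not the asker's code) using a pre-filled SizedQueue of Mechanize agents; the pool size of 5 and the URLs are illustrative:

    require 'thread'
    require 'mechanize'

    # Pool of N agents shared by many threads; each agent keeps its own
    # persistent connection, so at most POOL_SIZE connections are open.
    POOL_SIZE = 5
    POOL = SizedQueue.new(POOL_SIZE)
    POOL_SIZE.times { POOL.push(Mechanize.new) }

    def with_agent
      agent = POOL.pop # blocks until an agent is free
      yield agent
    ensure
      POOL.push(agent) if agent
    end

    threads = 50.times.map do |i|
      Thread.new do
        with_agent { |agent| agent.get("http://example.com/page/#{i}") }
      end
    end
    threads.each(&:join)

Threads block in POOL.pop instead of opening their own connections, trading some latency for a bounded connection count.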

Reverted back to Mechanize 1.0.0, and that solved the problem. Persistent connections are handled in a more reliable, multithreading-friendly way in 1.0, unlike in Net::HTTP::Persistent, which Mechanize 2+ uses. My advice: stick with Mechanize 1.0. It's more reliable, produces fewer errors, and doesn't have these problems with multithreaded code.
Note: contrary to what some of the comments may suggest, Mechanize 1.0 DOES implement persistent connections: take a look at the source code, or verify with Wireshark.

Related

Race condition with Redis session

I have the problem that my software has critical data stored in the session.
Since I'm using Ajax, and the user can simply open the software in several tabs, there WILL be parallel requests.
Limiting it to one request at a time is unfortunately not possible.
My original attempt to solve this problem was to use an after_filter in my application_controller to call a method that would detect changes other workers made and merge them into its own session object before saving it.
Unfortunately that mitigated my problem but didn't solve it completely.
It seems to me that between my after_filter and the middleware that actually saves my session, which is ActionDispatch::Session::RedisStore, there is still a gap big enough for another worker to write its own session.
I cannot think of any other solution to close this gap but this one:
Write a class that inherits from the middleware
teach it to execute the "merge code" in get_session and set_session
replace the original middleware with my class via config.middleware.swap (sketched below)
Before I do this I would like to ask for opinions and advice or, ideally, a better solution. Messing with the middleware seems too dangerous to me to do without asking for advice first.
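For reference, a minimal sketch of the plan above, assuming the private get_session/set_session methods of the Rack session-store API of that Rails era; merge_sessions is a hypothetical helper you would implement with your own conflict rules:

    # Subclass of the Redis session store that re-reads and merges
    # before every write. NOTE: this narrows the race window but does
    # not close it; there is still a read-modify-write gap unless you
    # also lock (e.g. in Redis itself).
    class MergingRedisStore < ActionDispatch::Session::RedisStore
      private

      def set_session(env, sid, session, options)
        _, stored = get_session(env, sid)   # what other workers wrote
        super(env, sid, merge_sessions(stored || {}, session), options)
      end

      def merge_sessions(stored, ours)
        stored.merge(ours) # naive: our keys win on conflict
      end
    end

    # config/application.rb
    # config.middleware.swap ActionDispatch::Session::RedisStore, MergingRedisStore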
Since you said the data in the session is critical, I think it's better to synchronize requests from each user while keeping concurrency between different users.
For example, you can fire up several Rails processes, each listening on its own port, and put a load balancer (e.g. Nginx) in front of those processes.
A load balancer with a sticky-session feature is ideal, but IP hash is also acceptable.
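A minimal sketch of that setup with Nginx's IP-hash balancing (the ports and the upstream name are illustrative):

    # nginx.conf - pin each client IP to the same Rails process
    upstream rails_app {
        ip_hash;                 # same client IP -> same backend
        server 127.0.0.1:3000;
        server 127.0.0.1:3001;
    }

    server {
        listen 80;
        location / {
            proxy_pass http://rails_app;
        }
    }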

How can I create a reliable and fast network daemon with Ruby?

I am trying to create a Ruby daemon process which clients will be able to connect to.
I need to ensure that the remote Ruby process always remains up and available for connection, so I need to detect network outages or unreachable errors.
I was thinking of having a heartbeat mechanism at the application level between clients and the server, and a timeout in the client if the connection fails.
I was told the select method in Ruby could help as well, but I'm not sure.
Can anyone share any good links/resources or impart some general wisdom to create reliable and fast daemon processes in Ruby?
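For the select idea mentioned in the question, here is a minimal client-side sketch using IO.select to detect a dead or unreachable server; the host, port, message format, and 5-second timeout are all illustrative:

    require 'socket'

    sock = TCPSocket.new('daemon.example.com', 9000)
    loop do
      sock.write("PING\n")                   # application-level heartbeat
      ready = IO.select([sock], nil, nil, 5) # wait up to 5s for a reply
      if ready.nil?
        warn 'heartbeat timed out; reconnecting'
        sock.close
        sock = TCPSocket.new('daemon.example.com', 9000)
      else
        sock.gets # expect "PONG\n"
      end
      sleep 5
    end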
I think a lot of people would use EventMachine for this type of application. At its core, it uses epoll (which is similar to select) to decide which socket to deal with next. There are lots of gems that build on EventMachine to let you run different types of servers; one example is em-websocket.
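As a minimal sketch of the server side with EventMachine (the port, protocol, and 5-second interval are illustrative):

    require 'eventmachine'

    module HeartbeatServer
      def post_init
        # Ping each client periodically so dead peers are detected.
        @timer = EventMachine::PeriodicTimer.new(5) { send_data "PING\n" }
      end

      def receive_data(data)
        send_data "PONG\n" if data.start_with?('PING')
      end

      def unbind
        @timer.cancel # connection closed or dropped
      end
    end

    EventMachine.run do
      EventMachine.start_server('0.0.0.0', 9000, HeartbeatServer)
    end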

Boost.Asio SSL ungraceful close

I am trying to handle SSL error scenarios where, for example, SSL async_handshake() is taking too long.
After some time (say 20 seconds) I want to close this connection (lowest_layer().close()).
I pass a shared_ptr to the connection object as a parameter to async_handshake(), so the object still exists; eventually the handshake handler is invoked and the object gets destroyed.
But I'm still getting sporadic crashes! It looks like after close() SSL is still trying to read from, or operate on, the read buffer.
So, the basic question: is it safe to hard-close() an SSL connection?
Any ideas?
Typically, the method I've used to stop outstanding asynchronous operations on a socket is socket::cancel, as described in the documentation. Their handlers will be invoked with asio::error::operation_aborted as the error parameter, which you'll need to handle somehow.
That said, I don't see a problem using close instead of cancel. Though it is difficult to offer much help or advice without some code to analyze.
Note that some Windows platforms have problems canceling outstanding asynchronous operations. The documentation has suggestions for portable cancellation if your application needs to support Windows.

Cache an FTP connection via session variables for use via AJAX?

I'm working on a Ruby web application that uses the Net::FTP library. One part of it allows users to interact with an FTP site via AJAX. When the user does something, an AJAX call is made; Ruby then reconnects to the FTP server, performs an action, and outputs information.
Every time the AJAX call is made, Ruby has to reconnect to the FTP server, and that's slow. Is there a way I could cache this FTP connection? I've tried caching it in the session hash, but "We're sorry, but something went wrong" is displayed, and a TCP dump is output to my logs, whenever I attempt to store it there. I haven't tried memcache yet.
Any suggestions?
What you are trying to do is possible, but far from trivial, and Rails doesn't offer any built-in support for it. In fact you will need to descend to the OS level to get this done, and if you have more than one physical server then it will get even more complicated.
First, you can't store a connection in the session. In fact you don't want to store any Ruby object in the session for many reasons, including but not limited to:
Some types of objects have trouble being marshalled/unmarshalled
Deploying could break things if the model changes and people have outdated objects serialized in their sessions
If you are using the cookie session store then you only have 4k
So in general, you only ever want to put primitives like strings, numbers and booleans into the session.
Now, as far as an FTP connection is concerned, this falls into the category of things that can't be serialized/unserialized reliably. The reason is that it's not just a Ruby object: it also holds an open socket, which will be closed as soon as the original object is garbage collected.
So, in order to keep an FTP connection persistent, it can't be stored in a controller instance variable, because the controller instance is per-request. You could try to instantiate it somewhere outside the controller instance, but that has the potential for memory leaks if you are not very careful to clean up the connections. Besides, if you have more than one app server instance, you would also need to guarantee that the user talks to the same app server instance on each request, or it wouldn't be able to find the connection. So, all in all, keeping the FTP connection open in the Ruby process is a non-starter.
What you need to do is open the connection in a separate process that any of the Ruby processes can talk to. There's really no established, standard way to do that; you'll have to roll your own. You could look into DRb to provide some of the primitives you will need.
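A minimal sketch of that approach, assuming a standalone broker process that owns the Net::FTP connection and exposes it over DRb; the URI, host, credentials, and list method are illustrative, and a real version needs reconnect and cleanup logic:

    require 'drb/drb'
    require 'net/ftp'

    # Broker process: owns one FTP connection, serializes access to it.
    class FtpBroker
      def initialize(host, user, password)
        @mutex = Mutex.new
        @ftp = Net::FTP.new(host)
        @ftp.login(user, password)
      end

      def list(dir = '.')
        @mutex.synchronize { @ftp.list(dir) }
      end
    end

    DRb.start_service('druby://localhost:8787',
                      FtpBroker.new('ftp.example.com', 'user', 'secret'))
    DRb.thread.join

    # In the Rails app, each request reuses the broker's connection:
    #   ftp = DRbObject.new_with_uri('druby://localhost:8787')
    #   entries = ftp.list('/incoming')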
AJAX can't talk to FTP directly; it's designed for HTTP. That doesn't stop you from writing something that caches the FTP data, though. You should probably profile it first to find out what's really slow; my guess is that the FTP access itself is just slow. Caching may be a mixed blessing, though: how do you know when the content of the FTP site changes?

Is there an alternative to Ajax that does not require polling, without server-side modifications?

I'm trying to create a small, basic "Ajax"-based multiplayer game. Coordinates of objects are provided by a PHP "handler". This handler.php file is polled every 200 ms via Ajax.
Since there is no need to poll when nothing happens, I wonder: is there something that could do the same thing without frequent polling? E.g. Comet, though I've heard that you need to configure server-side applications for Comet, and this is a shared web server, so I can't do that.
Maybe prevent the handler.php file from even returning a response if nothing has changed for the client. Is that possible? Then again, you'd still have the client uselessly asking for a response even though nothing has changed yet. Basically, it should only use bandwidth and server resources if something needs to be told to the client, e.g. the change of an object's coordinates.
Comet is generally used for this kind of thing, and it can be a fragile setup: it's not a particularly common technology, so it's easy not to "get it right". That said, there are more resources available now than when I last tried it about two years ago.
I don't think you can do what you're thinking and have handler.php simply not return anything and stop execution: The web server will keep the connection open and prevent any further polling until handler.php does something (terminates or provides output). When it does, you're still handling a response.
You can try a long polling technique, where your AJAX allows a very large timeout (e.g. 30 seconds), and handler.php spins without responding until it has something to report, then returns. (You'll want to make sure the spinning is not resource-intensive). If handler.php "expires" and nothing happens, have it exit and let AJAX poll again. Since it only happens every 30 seconds, it will be a huge improvement over ~5 times a second. That would keep your polling to a minimum.
But that's the sort of thing Comet is designed for.
As Ajax only offers you a client server request model (normally termed pull, rather than push), the only way to get data from the server is via requests. However a common technique to get around this is for the server to only respond when it has new data. So the client makes a request, the server hangs on to that request until something happens and then replies. This gets around the need for frequent polling even when the data hasn't changed as you only need the client send a new request after it gets a response.
Since you are using PHP, one simple method might be to have the PHP code sleep for 200 ms at a time between checks for data changes (usleep(200000), since PHP's sleep only takes whole seconds), and then return the data to the client when it does change.
EDIT: I would also recommend having a timeout on the request. So if nothing happens for say 2 seconds, a "no change" message is sent back. That way the client knows the server is still alive and processing its request.
Since this is tagged “html5”: HTML5 has <eventsource> and WebSocket, but the implementation side is still in the future tense in practice.
Opera implemented an old version of <eventsource> called <event-source>.
Here's a solution: use a SaaS Comet provider, such as WebSync On-Demand. There are no server resources to worry about, shared hosting or not, since it's all offloaded, and you can push out the information as needed.
Since it's SaaS, it'll work with any server language. For PHP, there's already a publisher written and ready to go.
The server must take part in this. Check with the hosting provider what modules are available. Or try to convince them to support Comet.
Maybe you should consider a small Virtual Private Server (VPS) for this.
One thing to add to the long-polling suggestions: if you're on a shared server, this approach will have limited scalability, as each active long poll keeps a connection (and a server-side process to service that connection) open. Your provider most likely has limits, either policy-defined or de facto, on the number of connections you can have open at a time, so you'll hit a wall if more sessions/windows than that are playing concurrently.
