How to terminate inactive WebSocket connections in Passenger

For the past few days we have been struggling with inactive WebSocket connections. The problem may lie at the network level. I would like to ask whether there is any switch or configuration option to set a timeout for WebSocket connections in Phusion Passenger in standalone mode.

You should probably solve this at the application level, because solving it in other layers will be uglier (those layers have less knowledge about the WebSocket).
With Passenger Standalone you could try setting max_requests. This should cause application processes to be restarted semi-regularly, and when shutting down a process Passenger should normally abort its WebSocket connections.
If you want more control over the restart period, you could also use, for example, a cron job that executes rolling restarts every so often, which shouldn't be noticeable to users either.
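As a rough sketch of what that could look like with Passenger Standalone (the value 1000, the schedule, and the application path are illustrative only; check the option names against your Passenger version, and note that max_requests can usually also be set via the equivalent key in Passengerfile.json):
# Restart each application process after it has served roughly 1000 requests
passenger start --max-requests 1000
# Or trigger restarts yourself from cron, e.g. nightly at 03:00
0 3 * * * passenger-config restart-app /var/www/myapp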

WebSockets in Ruby and Passenger (and maybe Node.js as well) aren't "native" to the server. Instead, your application "hijacks" the socket and controls all the details (timeouts, parsing, etc.).
This means that a solution must be implemented in the application layer (or in whatever framework you're using), since Passenger doesn't keep any information about the socket anymore.
I know this isn't the answer you wanted, but it's the fact of the matter.
Some approaches use native WebSockets, where the server controls the WebSocket connections (timeouts, parsing, etc.; e.g. the Ruby MRI iodine server), but mostly WebSockets are "hijacked" from the server and the application takes full control and ownership of the connections.
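To make that concrete, here is a minimal sketch of an application-level idle timeout on a hijacked socket (Rack full hijacking); the class name and the 60-second value are illustrative, and a real application would also have to speak the WebSocket framing protocol itself (or hand that off to a gem such as faye-websocket):
class IdleTimeoutApp
  IDLE_TIMEOUT = 60 # seconds of silence before the connection is dropped

  def call(env)
    return [404, { 'Content-Type' => 'text/plain' }, ['not found']] unless env['rack.hijack']

    env['rack.hijack'].call          # take the raw socket away from the server
    io = env['rack.hijack_io']

    Thread.new do
      begin
        loop do
          # Wait for readable data; give up after IDLE_TIMEOUT seconds of silence.
          break unless IO.select([io], nil, nil, IDLE_TIMEOUT)
          data = io.read_nonblock(4096, exception: false)
          break if data.nil? || data == :wait_readable
          # ... parse WebSocket frames / answer pings here ...
        end
      ensure
        io.close unless io.closed?   # terminate the inactive connection
      end
    end

    [200, {}, []] # the response is ignored by the server after a full hijack
  end
end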

Related

Pinging clients from a WebSocket server

I have a WebSocket connection being served from http-kit (Clojure), and it works great. I send pings from the client to make sure we're still connected, and everything works fine there. My question is: do people bother pinging the client from the server in these cases?
I was trying to set something up to remove the channel from the server if I didn't get a response, but it's not very functional-programming-friendly to set up timed processes and alter state to track the ping-pong cycle, so it was getting a little ugly. Then I thought: the server can handle hundreds of thousands of simultaneous connections, so should I just not worry about a few broken threads? How do people typically handle (or not handle) this?
The WebSocket protocol itself has built-in ping/pong heartbeating to keep the connection alive. If you want an additional layer on top of that, you could use the STOMP protocol, which coordinates heartbeats between client and server.
The one STOMP implementation I know of for the JVM is Stampy. There's one for JS too, stompjs. Note: the heartbeat implementation differs between these libs, and I believe the Stampy one is incorrect, so you'd have to roll your own.
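For what it's worth, here is a sketch of server-initiated heartbeats in Ruby (using the faye-websocket gem rather than http-kit, and assuming an EventMachine-based server such as Thin); the 30-second interval is illustrative:
require 'faye/websocket'
require 'eventmachine'

PING_INTERVAL = 30 # seconds between server-side pings

App = lambda do |env|
  return [400, { 'Content-Type' => 'text/plain' }, ['websocket only']] unless Faye::WebSocket.websocket?(env)

  ws = Faye::WebSocket.new(env)
  alive = true

  timer = EM.add_periodic_timer(PING_INTERVAL) do
    if alive
      alive = false
      ws.ping { alive = true }  # the block runs when the matching pong arrives
    else
      ws.close                  # no pong since the last ping: drop the connection
    end
  end

  ws.on(:close) { EM.cancel_timer(timer) }

  ws.rack_response
end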

Node.js being bombarded with reconnections after restart

We have a node instance with about 2500 client socket connections. Everything runs fine, except that occasionally something happens to the service (a restart or a failover event in Azure); when the node instance comes back up and all the socket connections try to reconnect, the service grinds to a halt and the log just shows repeated socket connects/disconnects. Even if we stop the service and start it again, the same thing happens. We currently send out a package to our on-premise servers to kill the users' Chrome sessions, and then everything works fine as users begin logging in again. The clients currently connect with 'forceNew' and are forced to use WebSockets only, rather than the default long polling followed by an upgrade. Has anyone seen this before, or have any ideas?
In your socket.io client code, you can force the reconnects to be spread out in time more. The two configuration variables that appear to be most relevant here are:
reconnectionDelay
Determines how long socket.io will initially wait before attempting a reconnect (it should back off from there if the server is down for a while). You can increase this to make it less likely that all the clients try to reconnect at the same time.
randomizationFactor
This is a number between 0 and 1.0 and defaults to 0.5. It determines how much the above delay is randomly modified, so that client reconnects are spread out rather than all happening at the same time. You can increase this value to increase the randomness of the reconnect timing.
See the socket.io client documentation for more details; a sketch of how these two settings combine follows below.
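To make the interaction of the two settings concrete, here is a language-agnostic sketch of a backoff-with-jitter calculation, written in Ruby to match the other examples in this thread; the constants mirror the socket.io option names, but the values and the exact formula socket.io uses may differ:
RECONNECTION_DELAY     = 1_000   # ms to wait before the first retry (illustrative)
RECONNECTION_DELAY_MAX = 30_000  # ms cap on the backed-off delay (illustrative)
RANDOMIZATION_FACTOR   = 0.5     # up to +/- 50% jitter

def reconnect_delay(attempt)
  base = [RECONNECTION_DELAY * (2**attempt), RECONNECTION_DELAY_MAX].min
  # Jitter the delay so that thousands of clients do not all retry at the same instant.
  jitter = 1.0 + RANDOMIZATION_FACTOR * (2 * rand - 1)
  (base * jitter).round
end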
You may also want to look at your server configuration to see whether it is as scalable as possible under a moderate number of incoming socket requests. While nobody expects a server to handle 2500 simultaneous connection attempts all at once, the server should be able to queue up these connection requests and serve them as it gets time, without immediately failing any incoming connection that can't be handled right away. There is a desirable middle ground: some number of connections held in a queue (usually controllable by server-side TCP configuration parameters), and when the queue gets too large, connections are failed immediately, at which point socket.io should back off and try again a little later. Adjusting the above variables tells it to wait longer before retrying.
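The "queue" referred to here is the TCP listen backlog; in Node.js it is the backlog argument to server.listen(). As a generic illustration in Ruby (the port and the value 1024 are arbitrary, and the OS may cap the backlog, e.g. via net.core.somaxconn on Linux):
require 'socket'

server = TCPServer.new('0.0.0.0', 8080)
server.listen(1024) # allow ~1024 not-yet-accepted connections to wait in the kernel queue

loop do
  client = server.accept # connections sit in the backlog until accepted here
  client.close
end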
Also, I'm curious why you are using forceNew. That does not seem like it would help you. Forcing WebSockets only (no initial polling) is a good thing.

How to fix "Sequel::DatabaseDisconnectError - Mysql::Error: MySQL server has gone away" on Heroku

I have a simple Sinatra application hosted on Heroku, using Sequel to connect to a MySQL database through the ClearDB addon.
The application works fine, except when it sits idle for more than a minute. In that case, the first request I make gives a "500 Internal Server Error", which the Heroku logs reveal to be:
Sequel::DatabaseDisconnectError - Mysql::Error: MySQL server has gone away
If I refresh the page after this error, it works fine, and the error will not return until the application sits idle for another minute or so.
The application is running two dynos, so the problem is not being caused by the Heroku dyno idling you might see on a free account. I contacted ClearDB support, and they gave me this advice:
If you are using connection pooling, then you should set the idle timeout at just below 60 seconds and/or set a keep-alive as I mentioned below. If you are not using connection pooling, then you must make sure that the app actually closes connections after queries and doesn't rely on the network timeout to shut them down.
I understand that I could create a cron job to hit the server every 30 seconds or so, but that seems an inelegant solution to the problem. And I don't understand the other suggestion about making sure the application closes connections: I'm just using Sequel to make queries, and I assumed Sequel manages the connections for me under the hood. Do I need to configure it to ensure that it closes connections? How would I do that?
Your connection times out, which is no big deal. Sequel can deal with that situation if you add the connection_validator extension to your DB:
DB.extension(:connection_validator)
As described in the documentation, this extension "detects an invalid connection, […] removes it from the pool and tries the next available connection, creating a new connection if no available connection is valid".

How to deploy an application to servers running WebSockets without terminating client sessions?

I am working on designing a large-scale web application that will rely heavily on WebSockets to communicate data between the client browser and our servers. The question I have is how we are going to push application updates to our servers without our clients noticing. We want as close to 100% uptime as possible.
The problem is that WebSocket connections are closed if an application update is deployed to the server they are connected to. Is there a way to hand off an existing connection to another application server?
I was considering implementing some client-side script that would automatically reconnect to the server if the connection was closed prematurely, but I would like to know from you smart people what other ideas I can consider.
Thanks!

OpenFire, HTTP-BIND and performance

I'm looking into getting an Openfire server started and setting up a strophe.js client to connect to it. My concern is that using http-bind might be costly in terms of performance versus making a straight XMPP connection.
Can anyone tell me whether my concern is relevant or not? And if so, to what extent?
The alternative would be to use a flash proxy for all communication with OpenFire.
Thank you
BOSH is more verbose than normal XMPP, especially when idle. An idle BOSH connection might make about 2 HTTP requests per minute, while a normal connection can sit idle for hours or even days without sending a single packet (in theory; in practice you'll have pings and keepalives to combat NATs and broken firewalls).
But, the only real way to know is to benchmark. Depending on your use case, and what your clients are (will be) doing, the difference might be negligible, or not.
Basics:
Socket - zero overhead.
HTTP - requests even on an idle session.
I doubt that you will have 1M users at once, but if you are aiming for that, then a connection-less protocol like HTTP will be much better, as I'm not sure that any OS can support that kind of connected-socket volume.
Also, you can tie your Openfire servers together into a farm, and you'll have nice scalability there.
We used Openfire and BOSH with about 400 concurrent users in the same MUC channel.
What we noticed is that Openfire leaks memory. We had about 1.5-2 GB of memory in use and got constant out-of-memory exceptions.
Also, Openfire's BOSH implementation is pretty bad. We then switched to Punjab, which was better but couldn't solve the Openfire issues.
We're now using ejabberd with its built-in http-bind implementation, and it scales pretty well. The load on the server running ejabberd is nearly 0.
At the moment we face the problem that the 5 web servers we use to handle the chat load are sometimes overloaded at about 200 connected users.
I'm trying to use WebSockets now, but it seems they don't work yet.
Maybe redirecting the http-bind traffic directly on a load balancer/proxy instead of via an Apache rewrite rule would solve the issue, but I couldn't find a way to do this yet.
Hope this helps.
I ended up using node.js and http://code.google.com/p/node-xmpp-bosh, as I faced some difficulties connecting directly to Openfire via BOSH.
I have a production site running with node.js configured to proxy all BOSH requests, and it works like a charm (around 50 concurrent users). The only downside so far: in the Openfire admin console you will not see the actual IP addresses of the connected clients; only the local server address shows up, since Openfire gets the connection from the node.js server.
