Client management with multiple SignalR servers behind a load balancer - websocket

Let's say I've got 2 stock ticker servers that are pushing quotes to web browser clients. The 2 servers sit behind a load balancer (round-robin mode).
Consider the following scenario:
Client A subscribes to the Google stock on server1 like so: Groups.Add(Context.ConnectionId, "Google");
Client B subscribes to the Yahoo stock on server2: Groups.Add(Context.ConnectionId, "Yahoo");
Client C subscribes to the Google stock on server2: Groups.Add(Context.ConnectionId, "Google");
Now both servers are already synced with the stock market, so when a stock gets updated they both get the update in real time.
My question is:
When server2 pushes a new update like so:
Clients.Group("Google").tick(quote);
Who are the clients it will send the message to? Will it always be client C? I guess not; we have a load balancer in between, so the connected clients at a given time may change, right? It may be C now, but on the next tick it could be clients A and C, or only A. A WebSocket connection is supposed to stay open, so how will the load balancer handle that? Will it always forward a given client's connection to the same server?
A backplane won't help me here, because my 2 servers are already synced and will send the same messages at the same time. So if I force them to route their messages through the backplane to the other server, it will end up with duplicate messages to the clients, like so:
Server1 gets ticker X for Google at 10:00 --> routes it to the backplane --> routed to server2
Server2 gets ticker X for Google at 10:00 --> routes it to the backplane --> routed to server1
Server1 sends 2 X Google tickers to its clients
Server2 sends 2 X Google tickers to its clients

OK, eventually I synced all group subscriptions through a shared cache (Redis) so all servers know all users and their subscriptions. This way each server knows its current clients' registered groups and pushes only the relevant data.
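For reference, here is a minimal sketch of that shared-cache approach, assuming ASP.NET SignalR 2.x and the StackExchange.Redis client; the hub name, key layout, and connection string are illustrative only:

```csharp
using System.Threading.Tasks;
using Microsoft.AspNet.SignalR;
using StackExchange.Redis;

public class StockTickerHub : Hub
{
    // One multiplexer shared by the whole process; "localhost" is a placeholder.
    private static readonly ConnectionMultiplexer Redis =
        ConnectionMultiplexer.Connect("localhost");

    public async Task Subscribe(string symbol)
    {
        // Local group membership on this server, exactly as in the question.
        await Groups.Add(Context.ConnectionId, symbol);

        // Shared registry so every server can see which connections follow which symbol.
        await Redis.GetDatabase().SetAddAsync($"subscribers:{symbol}", Context.ConnectionId);
    }

    public override async Task OnDisconnected(bool stopCalled)
    {
        // A real implementation would also remove this connection id from the Redis sets here.
        await base.OnDisconnected(stopCalled);
    }
}
```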
Update:
After much thought this is what we've ended up doing:
The load balancer will assign a sticky session to each incoming connection, so every connection will have one constant SignalR server.
Point 1 makes the Redis sync redundant, as each server will know all of its clients.
In case of a server/network failure, the SignalR client will reconnect and will be assigned a new server (in case of a server failure) by the load balancer.
After a reconnect, the SignalR client will resubscribe to the relevant stocks (this may be redundant if the failure was on the network and the load balancer redirects it to the old SignalR server, but I'll live with that).
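A minimal sketch of that reconnect-and-resubscribe behaviour, assuming the ASP.NET SignalR 2.x .NET client; the URL, hub name, and method names are placeholders, and a browser client would do the same thing in JavaScript:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.AspNet.SignalR.Client;

class TickerClient
{
    static async Task Main()
    {
        var connection = new HubConnection("https://example.com/signalr"); // placeholder URL
        var hub = connection.CreateHubProxy("StockTickerHub");

        hub.On<string>("tick", quote => Console.WriteLine($"Quote: {quote}"));

        // If the connection drops (server or network failure), reconnect; the load
        // balancer may hand us a different server, so resubscribe every time.
        connection.Closed += async () =>
        {
            await Task.Delay(TimeSpan.FromSeconds(5));
            await connection.Start();
            await hub.Invoke("Subscribe", "Google");
        };

        await connection.Start();
        await hub.Invoke("Subscribe", "Google");
        Console.ReadLine();
    }
}
```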

Related

How does an AWS Application Load Balancer select a target within a target group? How to load balance the WebSocket traffic?

I have an AWS Application Load Balancer to distribute the HTTP(S) traffic.
Problem 1:
Suppose I have a target group with 2 EC2 instances: micro and xlarge. Obviously they can handle different traffic levels. Does the load balancer distribute traffic proportionally to instance size, or just round robin? If only round robin is used and no other factors are taken into account, then it's not really balancing load, because at some point the micro instance will be suffering from the traffic while the xlarge one will starve.
Problem 2:
Suppose I have a target group with 2 EC2 instances, both the same size. But my service does not use a classic HTTP request/response flow. It uses WebSockets, i.e. a client makes an HTTP request just once, to establish a socket, and then keeps the socket open for a long time, sending and receiving messages (e.g. a chat service). Let's suppose my load balancer is using round robin and both EC2 instances have 1000 clients connected each. Now suppose one of the EC2 instances goes down and its 1000 connected clients drop their socket connections. The instance gets back up quickly and is ready to accept WebSocket calls again. The 1000 clients who dropped try to reconnect. Now, if the load balancer uses pure round robin, I'll end up with 1500 clients connected to instance #1 and 500 clients connected to instance #2, thus not really balancing the load correctly.
Basically, I'm trying to find out whether some more advanced logic is used to select a target in a group, or whether it's just a naive round-robin selection. If it's round robin only, then how can I really balance the WebSocket connection load?
WebSockets start out as HTTP or HTTPS connections, so a load balancer can dispatch them to a server. Once the server accepts the HTTP connection, both the server and the client "upgrade" the connection to use the WebSocket protocol. They then leave the connection open for WebSocket traffic. As far as the load balancer can tell, the connection is simply a long-lasting HTTP connection.
Taking a server down when it has websocket connections to clients requires your application to retry lost connections. Reconnecting on connection failure is one of the trickiest parts of websocket client programming. Your application cannot be robust without reconnect logic.
AWS's load balancer has no built-in knowledge of the capabilities of the servers behind it. You have observed that it sends requests equally to big and small servers. That can overwhelm the small ones.
I have managed this by building a /healthcheck endpoint in my servers. It's a straightforward https://example.com/healthcheck web page. You can put a little bit of content on the page announcing how many WebSocket connections are currently open, or anything else. Don't password-protect it or require a session to hit it.
My /healthcheck endpoints, whenever hit, measure the server load. I simply use the number of current websocket connections, but you can use any metric you want. I compare the current load to a load threshold configured for each server. For example, on a micro instance I can handle 20 open websockets, and on a production instance I can handle 400.
If the server load is too high, my endpoint gives back a 503 http error status along with its content. 503 typically means "I am overloaded, please try again later." It can also mean "I will shut down when all my connections are closed. Please don't use me for any more connections."
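A minimal sketch of such an endpoint, assuming ASP.NET Core minimal APIs; the connection counter and the threshold of 400 are stand-ins for whatever metric and capacity you actually use:

```csharp
var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// In a real server this counter would be updated (e.g. via Interlocked) wherever
// WebSocket connections open and close; here it is just a stand-in.
var openWebSockets = 0;
const int maxWebSockets = 400;   // per-instance capacity threshold (assumption)

app.MapGet("/healthcheck", () =>
    openWebSockets >= maxWebSockets
        ? Results.StatusCode(503)                                // overloaded: take me out of rotation
        : Results.Ok(new { openConnections = openWebSockets })); // healthy: report current load

app.Run();
```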
Then I configure the load balancer to perform those health checks every couple of minutes on all the servers in the server pool (AWS calls the pool a "target group"). The health check operation detects "unhealthy" servers and temporarily takes them out of its rotation. (The health check also detects crashed servers, which is good.)
You need this load balancer health check for a large-scale production setup.
All that being said, you will get best results if all your server instances in your pool have roughly the same capacity as each other.

How to deal with WebSocket on multiple servers?

I have WebSocket implemented in a real-time application, where connected clients get all server updates without a page refresh. That's fine and it's working very well. The problem is as follows:
Let's say I use two servers (server1 and server2) to serve client requests. If a client on server1 updates the database, all clients connected to server1 will get the updates, as expected, because server1 is aware of all its connected clients. However, clients connected to server2 do not get any updates, because they are being served by server2, which is not aware of the database updates (the updates were done by a client on server1)!
Is there a standard way of handling this? Also assume I have many servers.
If this has been addressed before, I'd also appreciate a pointer to it. Thanks
Handling DB values and changes should be the responsibility of each instance connected to the DB, whereas sharing updates (whether they involve a DB change or not) across the various clients should be the responsibility of the handler. For WebSockets, such updates are usually handled by writing them to a pub/sub channel/queue such as Redis, with all instances subscribed to the appropriate channel. Whenever any instance wants all clients to receive an update, it puts the update on that channel, and all the instances receive it and broadcast it to their own clients.
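A minimal sketch of that pub/sub pattern, assuming Redis as the channel and the StackExchange.Redis client; BroadcastToLocalClients is a placeholder for whatever loop sends to the WebSocket sessions held by this instance:

```csharp
using System;
using System.Threading.Tasks;
using StackExchange.Redis;

class UpdateFanOut
{
    static async Task Main()
    {
        var redis = await ConnectionMultiplexer.ConnectAsync("localhost"); // shared Redis (assumption)
        var sub = redis.GetSubscriber();

        // Every server instance subscribes to the same channel...
        await sub.SubscribeAsync("db-updates", (channel, message) =>
        {
            // ...and pushes whatever arrives to the clients connected to *this* instance.
            BroadcastToLocalClients(message);
        });

        // Whichever instance writes to the database also publishes the change,
        // so all instances (including itself) deliver it to their clients.
        await sub.PublishAsync("db-updates", "{\"table\":\"orders\",\"id\":42}");
        Console.ReadLine();
    }

    static void BroadcastToLocalClients(RedisValue message)
    {
        // Placeholder: loop over this server's open WebSocket sessions and send the message.
        Console.WriteLine($"broadcast: {message}");
    }
}
```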

Do I need session-clustering on a DB for load balancing a Jetty WebSockets server with HAProxy on AWS/EC2?

I am writing a chat-like application using WebSockets, with a Jetty 9.3.7 WebSocket server running on AWS/EC2. A description of the architecture is below:
(a) The servers are based on HTTPS (wss). I am thinking of using HAProxy using IP hash-based LB for this. The system architecture will look like this:
                       -->wss1: WebSocket server 1
                      /
clients->{HAProxy LB} -->wss2: WebSocket server 2
(a, b,..z)            \
                       -->wss3: WebSocket server 3
I am terminating HTTPS/wss on the LB per these instructions.
(b) Clients a...z connect to the system and will connect variously to wss1, wss2 or wss3 etc.
(c) Now, my WebSocket application works as follows. When one of the clients pushes a message, it is sent to the WS server that client is connected to (say wss1), and that message is then disseminated to a few of the other clients (the set of clients being programmatically determined by my WebSocket application running on wss1). E.g., a creates the message Hey guys! and pushes it to wss1, which then pushes it to clients b and c, so that b and c receive the Hey guys! message. b has a WebSocket connection to server wss2 and c has a WebSocket connection to wss3.
My question is, to push the message from the message receiving server, like (c) above, wss1 needs to know the WebSocket session/connection to b and c which may well be on a different WebSocket server. Can I use session clustering on Jetty to retrieve the sessions b and c are connected to? If not, what's the best way to provide this lookup while load balancing Jetty WebSockets?
Second, if I do use session clustering or some such method to retrieve the session, how can I use the sessions for b and c on wss1 to send the message to b and c? It appears like there is no way to do this except with some sort of communication between the servers. Is this correct?
If I have to use session clustering for this, is there a github example you can point me to?
Thanks!
I think session clustering is not the right tool here. Message Oriented Middleware (MOM) supporting the publish/subscribe model should be enough to cluster multiple real-time applications. As the author of Cettia, a real-time application framework, I've used the publish/subscribe model to scale an application horizontally.
The basic idea is:
A message to be exchanged through the MOM is an operation to be applied to each server locally. For example, an operation can be 'send a message to all clients'. Here 'all clients' means the ones connected to the server executing the given operation.
Every server subscribes to the same topic on the MOM. When a message arrives on the topic, each server deserializes it into an operation and executes that operation locally. This happens on every server.
If some operation originates on a server, that server should serialize it into a message and publish it to the topic.
With Cettia, all you need to do is plug your MOM into your Cettia application. If you want to build it from scratch, you need to implement the ideas above.
http://cettia.io/projects/cettia-java-server/1.0.0-Beta1/reference/#clustering
https://github.com/cettia/cettia-java-server/blob/1.0.0-Beta1/server/src/main/java/io/cettia/ClusteredServer.java
Here are working examples for several MOMs. Though they are written with Cettia, they might help you understand how the above idea works.
AMQP 1
Hazelcast 3
jGroups 3
JMS 2
Redis 2
Vert.x 2
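If you do want to build it from scratch rather than use one of the above, here is a rough sketch of the operation-over-MOM idea, assuming Redis pub/sub as the MOM and System.Text.Json for serialization; the Operation shape, the channel name, and SendToLocalClients are illustrative placeholders:

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;
using StackExchange.Redis;

record Operation(string Type, string Payload);   // e.g. Type = "sendToAll"

class ClusteredServer
{
    static async Task Main()
    {
        var redis = await ConnectionMultiplexer.ConnectAsync("localhost");
        var sub = redis.GetSubscriber();

        // Every server subscribes to the same topic and executes incoming operations locally.
        await sub.SubscribeAsync("operations", (_, message) =>
        {
            string? json = message;
            var op = JsonSerializer.Deserialize<Operation>(json!);
            if (op is { Type: "sendToAll" })
                SendToLocalClients(op.Payload);   // only the clients connected to *this* server
        });

        // When an operation originates here, serialize it and publish it to the topic
        // so every server (including this one) runs it against its own clients.
        var outgoing = new Operation("sendToAll", "Hey guys!");
        await sub.PublishAsync("operations", JsonSerializer.Serialize(outgoing));
        Console.ReadLine();
    }

    static void SendToLocalClients(string payload) =>
        Console.WriteLine($"local broadcast: {payload}");   // placeholder for the real send loop
}
```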

One application on multiple servers

I am developing a web-based application that needs to use WebSockets so the users can see updates in real time.
However, there is something that disturbs me. If there are too many clients simultaneously using the application, there needs to be a second server running the same application so that some of the users can be redirected to it.
Then how do I make the updates that happen on one of the two servers visible on the other one? Do I need to program a TCP connection between the two of them so they message each other when an update happens?
If your users are connected to both servers (e.g. some users connected to one server and some users connected to another server) and you want to broadcast a message to all connected users from one of the servers, then YES, you will need to have the server originating the message tell the other server to broadcast the message to all of its connected users. So, YES, the two servers will have to be connected so they can exchange these update commands.
If you had N servers (perhaps where N even varies over time), then you would probably designate one master server that keeps a connection to all the other servers. Then, when any notification is going to be sent to all connected users, a server would simply notify the master server, which would then notify all the servers, which would then broadcast to all their users. When each server starts up, it just connects to the one master server, and that's all it has to know about.
Of course, if you only have two servers, you don't need the master server concept. Each can just connect to the other.
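To illustrate the master-server idea, here is a rough sketch of a relay that every application server connects to, assuming plain TCP with newline-delimited notifications; the port and wire format are arbitrary choices for the example:

```csharp
using System.Collections.Concurrent;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

var peers = new ConcurrentDictionary<TcpClient, StreamWriter>();
var listener = new TcpListener(IPAddress.Any, 9000);   // port chosen arbitrarily for the example
listener.Start();

while (true)
{
    var client = await listener.AcceptTcpClientAsync();
    _ = Task.Run(async () =>
    {
        var reader = new StreamReader(client.GetStream());
        var writer = new StreamWriter(client.GetStream()) { AutoFlush = true };
        peers[client] = writer;
        try
        {
            // Each application server sends one newline-delimited notification per update.
            string? line;
            while ((line = await reader.ReadLineAsync()) != null)
            {
                // Relay the notification to every connected application server, which then
                // broadcasts it to its own WebSocket clients. (No per-writer locking here;
                // a real implementation would serialize concurrent writes.)
                foreach (var peer in peers.Values)
                    await peer.WriteLineAsync(line);
            }
        }
        finally
        {
            peers.TryRemove(client, out _);
            client.Dispose();
        }
    });
}
```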

Is it possible to communicate with HTTP requests between web and worker processes on Heroku?

I'm building an HTTP -> IRC proxy, it receives messages via an HTTP request and should then connect to an IRC server and post them to a channel (chat room).
This is all fairly straightforward, the one issue I have is that a connection to an IRC server is a persistent socket that should ideally be kept open for a reasonable period of time - unlike HTTP requests where a socket is opened and closed for each request (not always true I know). The implication of this is that a message bound for the same IRC server/room must always be sent via the same process (the one that holds a connection to the IRC server).
So I basically need to receive the HTTP request on my web processes, and then have them figure out which specific worker process has an open connection to the IRC server and route the message to that process.
I would prefer to avoid the complexity of a message queue within the IRC proxy app, as we already have one sitting in front of it that sends it the HTTP requests in the first place.
With that in mind, my ideal solution is to have a shared datastore between the web and worker processes, and to have the worker processes maintain a table of all the IRC servers they're connected to. When a web process receives an HTTP request, it could then look up the table to figure out if there is already a worker with a connection to the required IRC server and forward the message to it; if there is no existing connection, it could effectively act as a load balancer, pick an appropriate worker, and forward the message to it so it can establish and hold a connection to the IRC server.
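To make that concrete, here is a rough sketch of the lookup-and-forward idea, assuming a shared Redis instance (via StackExchange.Redis) for the routing table and a plain HttpClient call from the web process to the worker; the key name, worker URL, and /send endpoint are all hypothetical:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using StackExchange.Redis;

class IrcRouter
{
    static readonly HttpClient Http = new HttpClient();

    // Called by the web process for each incoming HTTP message bound for an IRC server.
    static async Task RouteAsync(IDatabase redis, string ircServer, string message)
    {
        // Look up which worker already holds a connection to this IRC server.
        var worker = await redis.HashGetAsync("irc-routes", ircServer);

        // If nobody does, pick a worker (naively here) and record the route.
        if (worker.IsNullOrEmpty)
        {
            worker = "http://worker-1.internal:5000";   // placeholder: real selection logic goes here
            await redis.HashSetAsync("irc-routes", ircServer, worker);
        }

        // Forward the message to that worker, which owns the persistent IRC socket.
        await Http.PostAsync($"{worker}/send?server={Uri.EscapeDataString(ircServer)}",
                             new StringContent(message));
    }
}
```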
Now, to do this my worker processes would need to be able to start an HTTP server and listen for requests from the web processes. On Heroku I know only web processes are added to the public-facing "routing mesh", which is fine; what I would like to know is whether it is possible to send HTTP requests between a web and a worker process internally within Heroku's network (outside of the "routing mesh").
I will use a message queue if I must, but as I said, I'd like to avoid it.
Thanks!
