I have a web application that makes 100 AJAX requests per second to the server from a single client. The issue is that my server is overloaded, and the load balancing I am currently doing is expensive. Can I use WebSockets instead of AJAX requests, and would that reduce the server load?
I have an AWS Application Load Balancer to distribute HTTP(S) traffic.
Problem 1:
Suppose I have a target group with 2 EC2 instances: a micro and an xlarge. Obviously they can handle different traffic levels. Does the load balancer distribute traffic proportionally to instance size, or just round robin? If only round robin is used and no other factors are taken into account, then it's not really balancing load, because at some point the micro instance will be suffering under the traffic while the xlarge starves.
Problem 2:
Suppose I have a target group with 2 EC2 instances, both the same size. But my service is not using the classic HTTP request/response flow; it uses websockets, i.e. a client makes an HTTP request just once, to establish a socket, and then keeps the socket open for a long time, sending and receiving messages (e.g. a chat service). Let's suppose my load balancer uses round robin and both EC2 instances have 1,000 clients connected each. Now suppose one of the EC2 instances goes down and its 1,000 connected clients drop their socket connections. The instance comes back up quickly and is ready to accept websocket connections again, and the 1,000 dropped clients try to reconnect. If the load balancer uses pure round robin, I'll end up with 1,500 clients connected to instance #1 and 500 clients connected to instance #2, which is not really balancing the load correctly.
Basically, I'm trying to find out whether some more advanced logic is used to select a target in a group, or whether it's just naive round-robin selection. If it's round robin only, how can I really balance the websocket connection load?
Websockets start out as http or https connections, so a load balancer can dispatch them to a server. Once the server accepts the http connection, both the server and the client "upgrade" the connection to use the websocket protocol. They then leave the connection open to use for websocket traffic. As far as the load balancer can tell, the connection is simply a long-lasting http connection.
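For illustration, here is a minimal browser-side sketch (the endpoint URL is a placeholder); the comments describe what happens on the wire:

// Opening a WebSocket from the browser. Under the hood this sends an
// ordinary HTTP GET carrying "Connection: Upgrade" and
// "Upgrade: websocket" headers, which a load balancer forwards like
// any other HTTP request.
var ws = new WebSocket('wss://www.mydomain.example/socket');
ws.onopen = function () {
  // The server answered "101 Switching Protocols"; the same TCP
  // connection now carries websocket frames instead of HTTP.
  ws.send('hello');
};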
Taking a server down when it has websocket connections to clients requires your application to retry lost connections. Reconnecting on connection failure is one of the trickiest parts of websocket client programming. Your application cannot be robust without reconnect logic.
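A minimal sketch of such reconnect logic, assuming a browser client (the endpoint URL, delays, and cap are illustrative):

// Reconnect with exponential backoff so a restarting server is not
// hammered by all of its dropped clients at once.
var retryDelay = 1000; // start at 1 second

function connect() {
  var ws = new WebSocket('wss://www.mydomain.example/socket');
  ws.onopen = function () {
    retryDelay = 1000; // reset the backoff after a successful connection
  };
  ws.onclose = function () {
    // Add jitter so reconnecting clients spread out over time.
    var delay = retryDelay + Math.random() * 1000;
    retryDelay = Math.min(retryDelay * 2, 30000); // cap at 30 seconds
    setTimeout(connect, delay);
  };
}
connect();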
AWS's load balancer has no built-in knowledge of the capabilities of the servers behind it. You have observed that it sends requests equally to big and small servers. That can overwhelm the small ones.
I have managed this by building a /healthcheck endpoint in my servers. It's a straightforward https://example.com/healthcheck web page. You can put a little bit of content on the page announcing how many websocket connections are currently open, or anything else. Don't password-protect it or require a session to hit it.
My /healthcheck endpoints, whenever hit, measure the server load. I simply use the number of current websocket connections, but you can use any metric you want. I compare the current load to a load threshold configured for each server. For example, on a micro instance I can handle 20 open websockets, and on a production instance I can handle 400.
If the server load is too high, my endpoint gives back a 503 http error status along with its content. 503 typically means "I am overloaded, please try again later." It can also mean "I will shut down when all my connections are closed. Please don't use me for any more connections."
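Here is a minimal sketch of such an endpoint, assuming a Node.js server with Express (the threshold and the openSockets counter stand in for whatever load metric you actually track):

var express = require('express');
var app = express();

var MAX_SOCKETS = 400;   // per-server threshold, e.g. 20 on a micro
var openSockets = 0;     // assumed to be updated as sockets open/close

app.get('/healthcheck', function (req, res) {
  if (openSockets >= MAX_SOCKETS) {
    // Tell the load balancer this server is overloaded.
    res.status(503).send('overloaded: ' + openSockets + ' sockets open');
  } else {
    res.status(200).send('ok: ' + openSockets + ' sockets open');
  }
});

app.listen(8080);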
Then I configure the load balancer to perform those health checks every couple of minutes on all the servers in the server pool (AWS calls the pool a "target group"). The health check operation detects "unhealthy" servers and temporarily takes them out of its rotation. (The health check also detects crashed servers, which is good.)
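If you prefer to script that configuration, here is a sketch using the AWS SDK for JavaScript (the target group ARN is a placeholder; the same settings are available in the console):

var AWS = require('aws-sdk');
var elbv2 = new AWS.ELBv2({ region: 'us-east-1' });

// Point the target group's health check at the /healthcheck endpoint
// and run it on an interval; 503 responses mark the target unhealthy.
elbv2.modifyTargetGroup({
  TargetGroupArn: 'arn:aws:elasticloadbalancing:...', // placeholder ARN
  HealthCheckPath: '/healthcheck',
  HealthCheckIntervalSeconds: 120,
  UnhealthyThresholdCount: 2
}, function (err, data) {
  if (err) console.error(err);
});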
You need this load-balancer health check for a large-scale production setup.
All that being said, you will get best results if all your server instances in your pool have roughly the same capacity as each other.
I need to set up a web farm with IIS 8/10.
I'm experienced in managing IIS on a single machine, but not in a load-balancing scenario.
I am planning to use Application Request Routing (ARR).
Can I set up ARR in IIS with a rule stating « Send the request to Server A, and if no answer is received within 3 seconds, send the request to Server B »?
Do you think ARR is capable of handling such a scenario?
Thank you very much,
In my opinion, IIS load balancing does not have this functionality.
Using IIS load balancing you can implement the following:
If you have Server A and Server B, the request goes to IIS; if Server A is not available, IIS redirects the request to Server B, which serves the response.
ARR does have a time-out value, the proxy time-out. A proxied request will wait that long for the back-end server to respond; if it times out, the connection is closed.
You can refer to the link below for more detail about IIS load balancing:
https://learn.microsoft.com/en-us/iis/extensions/configuring-application-request-routing-arr/http-load-balancing-using-application-request-routing
I have been doing some load testing on a SignalR server. According to my test case, a self-hosted SignalR server can handle only 20,000 concurrent connections at a time.
When SignalR has 20,000 open connections, the process consumes about 1.5 GB of RAM (which I think is too much). And when the connections exceed 22,000, new clients get a connection timeout error. The server never runs out of memory; it just stops responding to new requests.
I'm aware of server farming, and that I can use it in SignalR via a backplane, but I'm concerned about vertical scaling here. I have achieved 25,000 connections using long polling (async ASP.NET handlers). I would expect SignalR to achieve more concurrent connections, since it uses WebSockets.
Is there something I can do to get about 50,000 concurrent connections per SignalR node? This performance tuning guide is of no help because I'm using OWIN self-hosting. What can I do so that my server application uses less memory per connection?
I want to build an ASP.NET Web API server that can re-route incoming HTTP requests to other Web API servers. The main server will be the master, and its only job will be accepting requests and routing them to the other servers. Slave servers will inform the master when they start up and are ready to accept HTTP requests. Slave servers must not only report that they are alive but also which APIs they support. I think I have to re-map the routing tables on the master server at runtime. Is that possible?
This seems like load balancing by functionality. Is there a way to do this? I have to write a load balancer for Web API; any suggestion is welcome.
I have a server which supports web sockets. Browsers connect to my site and each one opens a web socket to www.mydomain.example. That way, my social network app can push messages to the clients.
Traditionally, using just HTTP requests, I would scale up by adding a second server and a load balancer in front of the two web servers.
With web sockets, the connection has to be made directly with the web server, not the load balancer, because if a machine has a physical limit of, say, 64k open ports, and the clients were connecting to the load balancer, then I couldn't support more than 64k concurrent users.
So how do I:
get the client to connect directly to the web server (rather than the load balancer) when the page loads? Do I simply load the JavaScript from a node, and have the load balancer (or whatever) randomly modify the URL for the script every time the page is initially requested?
handle a ripple start? The browser will notice that the connection is closed as the web server shuts down. I can write JavaScript code to attempt to reopen the connection, but the node will be gone for a while. So I guess I would have to go back to the load balancer to query the address of the next node to use?
I did wonder about the load balancers sending a redirect on the initial request, so that the browser initially requests www.mydomain.example and gets redirected to www34.mydomain.example. That works quite well, until the node goes down - and sites like Facebook don't do that. How do they do it?
Put an L3 load-balancer that distributes IP packets to your WebSocket server farm based on a source-IP-port hash. Since the L3 balancer maintains no state (the hashed source IP and port determine the target), it will scale to wire speed on low-end hardware (say, 10GbE). And since the distribution is deterministic, it will work with TCP (and hence WebSocket).
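The target-selection logic amounts to something like this sketch (the backend list and the choice of hash are illustrative):

var crypto = require('crypto');
var backends = ['10.0.0.1', '10.0.0.2', '10.0.0.3']; // websocket servers

// Deterministically map a client's source IP and port to one backend.
// The same connection always hashes to the same server, so the
// balancer needs no per-connection state.
function pickBackend(srcIp, srcPort) {
  var hash = crypto.createHash('md5')
    .update(srcIp + ':' + srcPort)
    .digest();
  return backends[hash.readUInt32BE(0) % backends.length];
}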
Also note that the 64k hard limit only applies to outgoing TCP/IP for a given (source) IP address. It does not apply to incoming TCP/IP. We have tested Autobahn (a high-performance WebSocket server) with 200k active connections on a 2-core, 4 GB RAM VM.
Also note that you can do L7 load balancing on the HTTP path announced during the initial WebSocket handshake. In that case the load balancer has to maintain state (which source IP-port pair goes to which backend node). It will probably still scale to millions of connections on a decent setup.
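As a sketch of what such L7 routing can look like, here is a minimal Node.js proxy using the http-proxy package (the path-to-backend mapping is illustrative; in this sketch the proxied TCP connection itself holds the state that a hardware balancer would keep in a table):

var http = require('http');
var httpProxy = require('http-proxy');

// Map the path announced in the websocket handshake to a backend.
var routes = {
  '/chat': 'http://10.0.0.1:3000',
  '/feed': 'http://10.0.0.2:3000'
};
var proxy = httpProxy.createProxyServer({});

var server = http.createServer(function (req, res) {
  // Plain HTTP requests are proxied by path as well.
  proxy.web(req, res, { target: routes[req.url] || routes['/chat'] });
});

// The websocket handshake arrives as an 'upgrade' event; route it by path.
server.on('upgrade', function (req, socket, head) {
  proxy.ws(req, socket, head, { target: routes[req.url] || routes['/chat'] });
});

server.listen(80);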
Disclaimer: I am the original author of Autobahn and work for Tavendo.
Note that if your websocket server logic runs on Node.js with socket.io, you can tell socket.io to use a shared Redis key/value store for synchronization.
This way you don't even have to care about the load balancer, events will propagate among the server instances.
// Attach the Redis adapter so that events emitted on one socket.io
// instance are propagated to clients connected to the other instances.
var io = require('socket.io')(3000);
var redis = require('socket.io-redis');
io.adapter(redis({ host: 'localhost', port: 6379 }));
See: Socket IO - Using multiple nodes
But at some point I guess Redis can become the bottleneck...
You can also achieve layer 7 load balancing with inspection and "routing functionality".
See "How to inspect and load-balance WebSockets traffic using Stingray Traffic Manager, and when necessary, how to manage WebSockets and HTTP traffic that is received on the same IP address and port." https://splash.riverbed.com/docs/DOC-1451