I am struggling to find information on how to gauge the scalability of websockets. A scenario:
Let's say a client wants to establish a socket connection from a browser, and the client application and the service layer (Micronaut) each have two instances behind an ELB. The service layer will sit in the us-east region, and anyone around the world can access the frontend app from a browser. We can expect a connection to stay open for an average of 2-5 minutes, and no longer than 30 minutes.
Is there a ballpark number for how many concurrent websocket connections a couple of servers can handle? Or are there certain factors I didn't mention that are vital to handling websocket connections in general?
Thank you in advance.
I'm assuming you want to know about the scalability of the WS implementation in Micronaut and not WS in general. Of course, the scalability of WS depends on both the specific implementation and WS itself. You probably already know this, but I wanted to state it for the record. You may also want to be sure you raise the file descriptor limit for your server process to the maximum (you may have to adjust kernel settings to increase the FDs).
Btw, don't forget to handle retries and reconnects, as you would for a low-level TCP connection.
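As a rough illustration of the retry/reconnect advice, here is a minimal sketch of reconnecting with capped exponential backoff and jitter. The names `backoff_delays` and `connect_with_retry` are made up for this example, and `connect` stands in for whatever WebSocket client call you actually use; the delay values are arbitrary assumptions, not recommendations.

```python
import random
import time


def backoff_delays(base=0.5, cap=30.0, attempts=6):
    """Yield capped exponential delays: base, 2*base, 4*base, ... up to cap."""
    for attempt in range(attempts):
        yield min(cap, base * (2 ** attempt))


def connect_with_retry(connect, base=0.5, cap=30.0, attempts=6, sleep=time.sleep):
    """Call connect() until it succeeds, sleeping a jittered delay after each failure.

    connect is a hypothetical zero-argument callable that returns a connection
    or raises OSError; plug in your own client's connect call.
    """
    last_error = None
    for delay in backoff_delays(base, cap, attempts):
        try:
            return connect()
        except OSError as exc:
            last_error = exc
            # Random jitter spreads reconnects out so that many clients
            # dropped at the same moment do not all retry in lockstep.
            sleep(random.uniform(0, delay))
    raise ConnectionError("gave up after %d attempts" % attempts) from last_error
```

The jitter matters most after a server restart, when every client would otherwise reconnect at the same instant.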
As HTTP/2 supports multiplexing, do we still need a pool of connections for microservice communication?
If yes, what are the benefits of having such a pool?
Example:
Service A => Service B
Both the above services have only one instance available.
Might multiple connections help overcome the OS buffer size limit of each connection (socket)? What else?
Yes, you still need a connection pool in a client contacting a microservice.
First, in general it's the server that controls the amount of multiplexing. A particular microservice server may decide that it cannot allow more than a very small degree of multiplexing.
If a client wants to use that microservice with a higher load, it needs to be prepared to open multiple connections, and this is where the connection pool comes in handy.
This is also useful to handle load spikes.
Second, HTTP/2 has flow control, and that may severely limit the data throughput on a single connection. If the flow control windows are small (the default defined by the HTTP/2 specification is 65535 bytes, which is typically very small for microservices), then client and server will spend a considerable amount of time exchanging WINDOW_UPDATE frames to enlarge the flow control windows, and this is detrimental to throughput.
To overcome this, you either need more connections (and again a client should be prepared for that), or you need larger flow control windows.
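To get a feel for the numbers, here is a back-of-the-envelope sketch. It assumes a deliberately simple model that is not from the answer above: the sender can have at most one flow control window in flight per round trip, so throughput on a connection is bounded by window size divided by round-trip time. Real implementations send WINDOW_UPDATE frames before the window is fully exhausted, so treat this as a pessimistic bound. The function names are made up for illustration.

```python
import math

DEFAULT_WINDOW_BYTES = 65535  # HTTP/2 default initial flow control window


def max_throughput(window_bytes, rtt_seconds):
    """Upper bound in bytes/second when one window is sent per round trip."""
    return window_bytes / rtt_seconds


def connections_needed(target_bytes_per_s, window_bytes, rtt_seconds):
    """Connections required to reach the target under that per-connection bound."""
    per_connection = max_throughput(window_bytes, rtt_seconds)
    return math.ceil(target_bytes_per_s / per_connection)


if __name__ == "__main__":
    # Example inputs are assumptions: a 10 ms round trip and a 100 MB/s target.
    rtt = 0.010
    target = 100_000_000
    print("default window:", connections_needed(target, DEFAULT_WINDOW_BYTES, rtt))
    print("1 MiB window:  ", connections_needed(target, 1024 * 1024, rtt))
```

Under this model the two remedies are interchangeable: either multiply the connections or enlarge the window by the same factor.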
Third, in case of large HTTP/2 flow control windows, you may hit TCP congestion (and this is different from socket buffer size) because the consumer is slower than the producer. It may be a slow server for a client upload (REST request with a large payload), or a slow client for a server download (REST response with a large payload).
Again, to overcome TCP congestion, the solution is to open multiple connections.
Comparing HTTP/1.1 with HTTP/2 for the microservice use case, it's typical that the HTTP/1.1 connection pools are way larger (e.g. 10x-50x) than HTTP/2 connection pools, but you still want connection pools in HTTP/2 for the reasons above.
[Disclaimer: I'm the HTTP/2 implementer in Jetty.]
We had an initial implementation where the Jetty HttpClient was using the HTTP/2 transport with a hardcoded single connection per domain, because that's what HTTP/2 preached for browsers.
When exposed to real world use cases - especially microservices - we quickly realized how bad of an idea that was, and switched back to use connection pooling for HTTP/2 (like HttpClient always did for HTTP/1.1).
I'm experimenting with server-sent events (SSE) as an alternative to websockets for real-time data pushing (data in my application is primarily one-directional).
How scalable would this be? I know that each SSE connection uses an HTTP request -- does this mean that a web server can handle as many SSE connections as HTTP requests (something like this answer)? I feel as though this might be the case, but I'm not sure how a SSE connection works and if it is substantially more complex/resource-hungry than a simple HTTP request.
I'm mostly wondering how this compares to the number of concurrent websockets a browser can keep open. This answer suggests that only ~1400-1800 sockets can be handled by a server at the same time.
Can someone provide some insight on this?
(To clarify, I am not asking about how many SSE connections can be kept open from the client; I am asking about how many can be reasonably kept open by a web server.)
To give a web server as an example: Tomcat 8 and above uses the NIO connector to handle incoming requests. It can service a maximum of 10,000 concurrent connections (docs). It does not say anything about max connections per se. They also provide another parameter called acceptCount, which sets the length of the queue for incoming connections once that limit is reached.
Socket connections are treated as files. Every incoming connection to Tomcat opens a socket, and how many can be open depends on the OS; on Linux, for example, it depends on the file-descriptor policy. A common error when too many connections are open, or the maximum has been reached, is the following:
java.net.SocketException: Too many open files
You can change the number of open files by editing
/etc/security/limits.conf
It is not clear what the maximum allowed limit is. Some say the default for Tomcat is 1096, but the default for Linux is reportedly 30,000, and it can be changed.
In the article I shared, the LinkedIn team was able to reach 250K connections on one host.
So that should give you a pretty good idea of the maximum number of SSE connections possible. It depends on your web server's max connection configuration, OS capacity, etc.
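Rather than guessing at the defaults, you can ask the OS what limit a process actually runs with. The sketch below uses Python's standard `resource` module (Unix only) purely to illustrate the soft and hard open-file limits; it inspects the Python process itself, not Tomcat. For a JVM the same limits apply to the java process and are set from outside it, for example through limits.conf. The function names are made up for this example.

```python
import resource


def fd_limits():
    """Return (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit_to_hard():
    """Raise the soft limit up to the hard limit, which needs no privileges."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY and soft != hard:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
        except (ValueError, OSError):
            # Some platforms reject this; keep the existing limits.
            pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print("before:", fd_limits())
    print("after: ", raise_soft_limit_to_hard())
```

Going beyond the hard limit is what requires editing limits.conf (or the equivalent for however the server process is started).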
When designing a client/server architecture, is there any advantage to multiplexing multiple WebSocket connections from the same process to the server (i.e. sharing one connection) vs. opening one WebSocket connection per thread/session in the client (as is typically done when connecting to memcached or database servers)?
I'm aware of the overhead associated with each connection (e.g. RAM), but I expect to have 1K-10K connections at most on each client side.
Specific use case:
Let's assume I have a remote server with multiple sessions on one side, and multiple clients on the other side. Each client will connect to a different session through the websocket server.
On the remote server, there are two ways to implement it: (1) each session creates its own websocket connection, or (2) all sessions use the same websocket connection.
From a connection point of view, I like the sharing solution (one websocket connection for all sessions), because the websocket server is limited by the number of connections available (so sharing saves servers and helps scaling).
But from a traffic/data speed/performance point of view, if a session sends lots of small packets through the connection, a single shared connection will not let us use the bandwidth efficiently (e.g. by collecting several small packets into one payload, or splitting a big packet into smaller ones), because packets from different sessions have different sources and different destinations and so cannot simply be combined. The alternative is to create a "virtual connection" layer that manages each session's data to maximize speed, but that would add a lot of implementation complexity.
Any other opinions?
Thanks,
JB.
I think you should consider using a limited connection pool, like they do with Database connection architecture.
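To make the "limited connection pool" idea concrete, here is a minimal sketch of a fixed-size pool built on a thread-safe queue. `ConnectionPool` and `factory` are names invented for this example; `factory` stands in for whatever opens one of your WebSocket connections, and a real pool would also need health checks and reconnection.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: at most `size` connections ever exist."""

    def __init__(self, factory, size):
        # factory is a hypothetical zero-argument callable returning a connection.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks while every connection is borrowed; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the caller raised.
            self._idle.put(conn)
```

Usage would look like `with pool.connection() as conn: conn.send(...)`, where `send` is whatever method your connection object offers. Sessions then share a small, bounded number of sockets instead of each opening their own.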
Another solution I would consider is a Pub/Sub database middleman such as Redis. This allows you to use existing solutions as well as easier scalability.
To the best of my understanding, both having a single connection and using a multitude of connections have their issues.
For example, one connection can send only one message at a time. A big enough message could block the connection... are you moving big data?
Many connections can cause an overhead that could be very expensive as well as introduce more chances for errors. Consider the following:
Creating new connections is very expensive, uses bandwidth, suffers from longer network delays and requires local resources, and this is exactly what websockets allow us to avoid.
You will run into scalability issues. For instance, Heroku limits websocket connections to 600 per server, or at least they did so a short while back (and I think it's reasonable)... How will you connect all the servers together to one data-store?
Remember every OS has an open file limit and that websockets use the IO architecture (each one is an 'open-file', so that websockets are a limited resource).
Regarding traffic/data speed/performance, it is a question of server architecture... but I believe you will actually see a slight speed increase by using one connection (or a small pool of connections). It's important to remember that there isn't any effective multi-tasking when you need to send TCP/IP packets.
Also, with a limited number of connections (even with one connection), you will be able to benefit from the OS's packet joining feature that will allow you to send a number of websocket frames over one TCP/IP packet (unless you constantly flush the TCP/IP socket). You will actually waste more bandwidth with more connections - even disregarding the bandwidth used to open each new connection.
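For what it's worth, the "virtual connection" layer from the question does not have to be elaborate. Below is a minimal sketch of one possible framing, which is my own assumption and not something either post specifies: each message is tagged with a session id and a length, so small messages from different sessions can be batched into one binary websocket payload and split apart again on the other side.

```python
import struct

# Hypothetical header: session id (uint32) + payload length (uint16), network byte order.
HEADER = struct.Struct("!IH")


def pack_batch(messages):
    """Combine (session_id, payload_bytes) pairs into a single bytes blob.

    Payloads longer than 65535 bytes do not fit the uint16 length field
    and make struct raise an error.
    """
    parts = []
    for session_id, payload in messages:
        parts.append(HEADER.pack(session_id, len(payload)))
        parts.append(payload)
    return b"".join(parts)


def unpack_batch(blob):
    """Split a blob produced by pack_batch back into (session_id, payload) pairs."""
    messages = []
    offset = 0
    while offset < len(blob):
        session_id, length = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        messages.append((session_id, blob[offset:offset + length]))
        offset += length
    return messages
```

The receiving side routes each pair to the right client by session id. The cost is six bytes of header per message in this layout.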
Just my 5 cents, we will all think differently, I'm sure.
Good Luck!
If a bunch of "Slow HTTP" connections to a server can consume so many resources as to cause a denial of service, why wouldn't a bunch of web sockets to a server cause the same problem?
The accepted answer to a different SO question says that it is almost free to maintain an idle connection.
If it costs nothing to maintain an open TCP connection, why does a "Slow HTTP" attack cause denial of service?
A WebSocket and a "slow" HTTP connection both use an open connection. The difference is in expectations of the server design.
Typical HTTP servers do not need to handle a large number of open connections and are designed around the assumption that the number of open connections is small. If the server does not protect against slow clients, then an attacker can force a server designed around this assumption to hit a resource limit.
Here are a couple of examples showing how the different expectations can impact the design:
If you only have a few HTTP requests in flight at a time, then it's OK to use a thread per connection. This is not a good design for a WebSocket server.
The default file descriptor limits are often adequate for typical HTTP scenarios, but not for a large number of connections.
It is possible to design an HTTP server to handle a large number of open connections and several servers do so out of the box.
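As an illustration of the alternative to thread-per-connection, here is a small sketch using Python's asyncio, where every connection is a coroutine on a single thread. It is a toy line-echo server, not a WebSocket or HTTP server, and the `demo` helper and its client count are assumptions chosen to stay well under common default file descriptor limits.

```python
import asyncio


async def handle(reader, writer):
    """Echo lines back. Each connection is a coroutine, not a thread."""
    try:
        while True:
            line = await reader.readline()
            if not line:  # client closed the connection
                break
            writer.write(line)
            await writer.drain()
    finally:
        writer.close()


async def demo(n_clients=100):
    """Hold n_clients connections open at once against one single-threaded server."""
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        conns = [await asyncio.open_connection("127.0.0.1", port)
                 for _ in range(n_clients)]
        replies = []
        for i, (reader, writer) in enumerate(conns):
            writer.write(b"ping %d\n" % i)
            await writer.drain()
            replies.append(await reader.readline())
        for _, writer in conns:
            writer.close()
            await writer.wait_closed()
    return replies


if __name__ == "__main__":
    print(len(asyncio.run(demo())), "connections served by one thread")
```

An idle connection here costs a socket and a suspended coroutine rather than a blocked thread, which is why slow or idle clients are far less of a threat to this kind of design.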
I am doing some performance testing with a large number of threads. Each thread is sending HTTP requests to another IP. It looks like at some stages the connections are closed (because there are too many threads) and then of course have to be reopened.
I am looking to get some ballpark figures for how long it takes Windows to open TCP connections.
Is there any way I can get this?
Thanks.
This is highly dependent on the endpoints you're trying to connect to, is it not?
As an extreme best case, you can test it yourself by targeting an IIS on localhost.
I wouldn't be surprised if routers and servers that you are connecting through may drop connections as a measure against what could be perceived as connection storms or even denial-of-service attacks.
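If you want to try the localhost best case yourself, a minimal sketch like the following times the TCP connect with Python's standard `socket` module. The function name, the sample count and the port in the usage comment are assumptions; point it at whatever is actually listening on your machine.

```python
import socket
import time


def time_connects(host, port, samples=20):
    """Return the time in seconds each TCP connect took."""
    times = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection returns once the TCP handshake has completed.
        with socket.create_connection((host, port), timeout=5):
            times.append(time.perf_counter() - start)
    return times


if __name__ == "__main__":
    # Assumes something (e.g. a local IIS) is listening on port 80.
    results = time_connects("127.0.0.1", 80)
    print("min %.6fs  max %.6fs" % (min(results), max(results)))
```

Numbers measured against localhost leave out network latency entirely, so treat them as a lower bound rather than what you will see against a remote IP.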