Using HTTP2 how can I limit the number of concurrent requests? - ajax

We have a client <-> server system working over HTTP/1.1. The client makes hundreds (sometimes thousands) of concurrent requests to the server.
Because of the browsers' default limit on HTTP/1.1 connections, the client actually makes these requests in batches of 6 to 8 concurrent requests. We thought we could get some performance improvement if we could increase the number of concurrent requests.
We moved the system to HTTP/2 and we now see the client sending all the requests simultaneously, as we wanted.
The problem now is the opposite: the server cannot handle so many concurrent requests.
How can we limit the number of concurrent requests the client makes to something more manageable for the server, say 50 to 100 concurrent requests?
We were assuming that HTTP/2 would let us adjust the number of concurrent requests:
With HTTP/2 the client remains in full control of how server push is
used. The client can limit the number of concurrently pushed streams;
adjust the initial flow control window to control how much data is
pushed when the stream is first opened; or disable server push
entirely. These preferences are communicated via the SETTINGS frames
at the beginning of the HTTP/2 connection and may be updated at any time.
Also here:
Or maybe, if possible, we can limit this on the server side (which I think is more maintainable).
But it looks like these solutions are talking about Server Push, and what we have is the client pulling.
In case it helps in any way, our architecture looks like this:
Client ==[http 2]==> ALB(AWS Beanstalk) ==[http 1.1]==> nginx ==[http 1.0]==> Puma

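There is a dedicated parameter for this in the SETTINGS frame.
You can set SETTINGS_MAX_CONCURRENT_STREAMS to 100 on the server side; it tells the client how many streams (requests) it may have open concurrently on that connection.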


Blockers for maximum number of http requests from a pod

I have a Go app which is deployed to two 8-core pod instances on Kubernetes.
It receives a list of ids that I later use to retrieve some data from another service, by sending each id to a POST endpoint.
I am using a bounded concurrency pattern to cap the number of simultaneous goroutines (and therefore of requests) to this external service.
I set the limit of concurrency as:
sem := make(chan struct{}, MAX_GO_ROUTINES)
With this setup I started playing around with the MAX_GO_ROUTINES number by increasing it. I usually receive around 20000 ids to check, so I have tried setting MAX_GO_ROUTINES to anywhere between 100 and 20000.
One thing I notice is that as I go higher and higher, some requests start to fail with the message: connection reset from this external service.
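For reference, here is a minimal sketch of the bounded concurrency pattern described above, built around the same buffered-channel semaphore. The names processAll and fakePost, and the limit of 100, are assumptions for illustration; fakePost only sleeps where the real POST call would go.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// maxGoRoutines plays the role of MAX_GO_ROUTINES; 100 is an arbitrary value.
const maxGoRoutines = 100

// processAll calls work once per id and never runs more than limit calls at
// the same time. It returns the highest concurrency it observed.
func processAll(ids []int, limit int, work func(id int)) int64 {
	sem := make(chan struct{}, limit) // buffered channel used as a counting semaphore
	var wg sync.WaitGroup
	var inFlight, peak int64

	for _, id := range ids {
		sem <- struct{}{} // blocks while limit workers are already running
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			defer func() { <-sem }() // give the slot back

			// Track the peak number of workers running at once.
			n := atomic.AddInt64(&inFlight, 1)
			for {
				p := atomic.LoadInt64(&peak)
				if n <= p || atomic.CompareAndSwapInt64(&peak, p, n) {
					break
				}
			}
			work(id)
			atomic.AddInt64(&inFlight, -1)
		}(id)
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	ids := make([]int, 2000)
	for i := range ids {
		ids[i] = i
	}
	// fakePost stands in for the POST request to the external service.
	fakePost := func(id int) { time.Sleep(time.Millisecond) }
	fmt.Println("peak concurrency:", processAll(ids, maxGoRoutines, fakePost))
}
```

The reported peak can never exceed the limit, because a goroutine only starts after it has taken a slot from the channel.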
So my questions are:
What is the blocker in this case?
What is the limit of concurrent HTTP POST requests a server with 8 cores and 4GB of ram can send? Is it a memory limit? or file descriptors limit?
Is the error I am getting coming from my server or from the external one?
What is the blocker in this case?
As the comment mentioned: HTTP "connection reset" generally means:
the connection was unexpectedly closed by the peer. The server appears
to have dropped the connection on the unsuspecting HTTP client before
sending back a response. This is most likely due to the high load.
Most webservers (like nginx) have a queue where they stage connections while they wait to be serviced. When the queue exceeds some limit, connections may be shed and "reset". So this is most likely your upstream service being saturated (i.e. your app sends more requests than the service can handle and overloads its queue).
What is the limit of concurrent HTTP POST requests a server with 8 cores and 4GB of ram can send? Is it a memory limit? or file descriptors limit?
All :) At some point your particular workload will hit a logical limit (like file descriptors) or a "physical" limit like memory. Unfortunately the only way to truly understand which resource is going to be exhausted (and which constraints you run up against) is to run tests, and to profile and benchmark your workload :(
Is the error I am getting coming from my server or from the external one?
The HTTP connection reset most likely comes from the external service: it indicates that the connection peer (the upstream service) reset the connection.
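If you want to count these failures separately in your own code, one possible sketch is to check the returned error for ECONNRESET. The helper names isConnReset and simulatedReset and the example URL are made up for illustration; simulatedReset only imitates the shape of the error net/http returns. Keep in mind that the reset can also be sent by something in between, such as a proxy or load balancer.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/url"
	"os"
	"syscall"
)

// isConnReset reports whether err, or any error it wraps, is ECONNRESET,
// meaning the other end of the TCP connection reset it.
func isConnReset(err error) bool {
	return errors.Is(err, syscall.ECONNRESET)
}

// simulatedReset builds an error shaped like the one an HTTP client gets
// when the peer resets the connection (hypothetical URL).
func simulatedReset() error {
	return &url.Error{
		Op:  "Post",
		URL: "http://external.example/check",
		Err: &net.OpError{
			Op:  "read",
			Net: "tcp",
			Err: os.NewSyscallError("read", syscall.ECONNRESET),
		},
	}
}

func main() {
	fmt.Println(isConnReset(simulatedReset()))           // true
	fmt.Println(isConnReset(errors.New("some timeout"))) // false
}
```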

Do we still need a connection pool for microservices talking HTTP2?

As HTTP/2 supports multiplexing, do we still need a pool of connections for microservice communication?
If yes, what are the benefits of having such a pool?
Service A => Service B
Both the above services have only one instance available.
Might multiple connections help overcome the OS buffer size limitation for each connection (socket)? What else?
Yes, you still need a connection pool in a client contacting a microservice.
First, in general it's the server that controls the amount of multiplexing. A particular microservice server may decide that it cannot allow more than a very small number of concurrent streams.
If a client wants to use that microservice with a higher load, it needs to be prepared to open multiple connections, and this is where the connection pool comes in handy.
This is also useful to handle load spikes.
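As a rough illustration of what such a pool can look like on the client side, here is a minimal Go sketch (not Jetty's implementation). It assumes a fixed pool size and plain round-robin selection; clientPool, newClientPool and pick are hypothetical names. Each client gets its own Transport, and separate Transports do not share connections.

```go
package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
)

// clientPool hands out clients in round-robin order. Every client owns its
// own Transport, so its connections are separate from those of the others.
type clientPool struct {
	clients []*http.Client
	next    uint64
}

// newClientPool creates size clients, each with a private copy of the
// default transport settings.
func newClientPool(size int) *clientPool {
	p := &clientPool{clients: make([]*http.Client, size)}
	for i := range p.clients {
		p.clients[i] = &http.Client{
			Transport: http.DefaultTransport.(*http.Transport).Clone(),
		}
	}
	return p
}

// pick returns the next client in turn; it is safe for concurrent use.
func (p *clientPool) pick() *http.Client {
	n := atomic.AddUint64(&p.next, 1)
	return p.clients[(n-1)%uint64(len(p.clients))]
}

func main() {
	pool := newClientPool(4)
	a, b := pool.pick(), pool.pick()
	fmt.Println(a != b) // true: consecutive picks use different clients
}
```

A real pool would probably also grow under load and drop idle connections; this sketch only shows how requests get spread over several connections.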
Second, HTTP/2 has flow control and that may severely limit the data throughput on a single connection. If the flow control windows are small (the default defined by the HTTP/2 specification is 65535 bytes, which is typically very small for microservices) then client and server will spend a considerable amount of time exchanging WINDOW_UPDATE frames to enlarge the flow control windows, and this is detrimental to throughput.
To overcome this, you either need more connections (and again a client should be prepared for that), or you need larger flow control windows.
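To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes a simplified model in which the sender can have at most one window of unacknowledged data in flight per round trip; the 2 ms round-trip time and the 10 MiB payload are made-up example values, and real implementations send WINDOW_UPDATE frames earlier than this worst case.

```go
package main

import (
	"fmt"
	"time"
)

// defaultWindow is the initial HTTP/2 flow control window in bytes, as
// quoted in the answer above.
const defaultWindow = 65535

// maxThroughput returns a rough upper bound, in bytes per second, for one
// stream when at most one window of data can be in flight per round trip.
func maxThroughput(window int, rtt time.Duration) float64 {
	return float64(window) / rtt.Seconds()
}

// windowsNeeded returns how many full windows a payload of size bytes spans.
func windowsNeeded(size, window int) int {
	return (size + window - 1) / window
}

func main() {
	rtt := 2 * time.Millisecond // assumed round-trip time between the services
	fmt.Printf("%.1f MB/s\n", maxThroughput(defaultWindow, rtt)/1e6) // 32.8 MB/s
	fmt.Println(windowsNeeded(10<<20, defaultWindow))                // 161
}
```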
Third, in case of large HTTP/2 flow control windows, you may hit TCP congestion (and this is different from socket buffer size) because the consumer is slower than the producer. It may be a slow server for a client upload (REST request with a large payload), or a slow client for a server download (REST response with a large payload).
Again to overcome TCP congestion the solution is to open multiple connections.
Comparing HTTP/1.1 with HTTP/2 for the microservice use case, it's typical that the HTTP/1.1 connection pools are way larger (e.g. 10x-50x) than HTTP/2 connection pools, but you still want connection pools in HTTP/2 for the reasons above.
[Disclaimer: I'm the HTTP/2 implementer in Jetty.]
We had an initial implementation where the Jetty HttpClient was using the HTTP/2 transport with a hardcoded single connection per domain because that's what HTTP/2 preached for browsers.
When exposed to real world use cases - especially microservices - we quickly realized how bad of an idea that was, and switched back to use connection pooling for HTTP/2 (like HttpClient always did for HTTP/1.1).

How to fix HTTP/2.0 504 Gateway Timeout for multi simultaneous XHR connections when using HTTP/2

I activated HTTP/2 support on my server. Now I have a problem with AJAX/jQuery scripts such as uploads or API handling.
After PHP's max_input_time of 60 seconds I get: [HTTP/2.0 504 Gateway Timeout 60034ms]
With HTTP/1, only a few connections were started simultaneously, and when one finished another started.
With HTTP/2, all of them start at once.
When, for example, 100 images are uploaded, it takes too long for all of them to finish.
I don't want to change max_input_time. I hope to limit the simultaneous connections in the scripts.
Thank you.
HTTP/2 intentionally allows multiple requests in parallel. This differs from HTTP/1.1, which only allowed one request at a time per connection (but which browsers compensated for by opening 6 parallel connections). The downside to drastically increasing that limit is that you can have more requests on the go at once, contending for bandwidth.
You’ve basically two choices to resolve this:
Change your application to throttle uploads rather than expecting the browser or the protocol to do this for you.
Limit the maximum number of concurrent streams allowed by your webserver. In Apache, for example, this is controlled by the H2MaxSessionStreams directive, while in Nginx it is similarly controlled by the http2_max_concurrent_streams config. Other streams will need to wait.
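For the Nginx case, the setting could look roughly like this. The value 20 is an arbitrary example, not a recommendation; pick one that matches what the backend can process within the timeout.

```nginx
# Inside the http or server block: cap the number of concurrent
# HTTP/2 streams (requests) per connection.
http2_max_concurrent_streams 20;
```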

How many SSE connections can a web server maintain?

I'm experimenting with server-sent events (SSE) as an alternative to websockets for real-time data pushing (data in my application is primarily one-directional).
How scalable would this be? I know that each SSE connection uses an HTTP request -- does this mean that a web server can handle as many SSE connections as HTTP requests (something like this answer)? I feel as though this might be the case, but I'm not sure how a SSE connection works and if it is substantially more complex/resource-hungry than a simple HTTP request.
I'm mostly wondering how this compares to the number of concurrent websockets a browser can keep open. This answer suggests that only ~1400-1800 sockets can be handled by a server at the same time.
Can someone provide some insight on this?
(To clarify, I am not asking about how many SSE connections can be kept open from the client; I am asking about how many can be reasonably kept open by a web server.)
Tomcat 8 and above (to take one web server as an example) uses the NIO connector to handle incoming requests. It can service a maximum of 10,000 concurrent connections (docs). The docs do not say anything about a maximum number of connections per se. They also provide another parameter called acceptCount, which is the fallback queue if connections exceed 10,000.
Socket connections are treated as files. Every incoming connection to Tomcat opens a socket, and how many can be open depends on the OS; on Linux, for example, it depends on the file-descriptor limits. A common error when too many connections are open, or the maximum has been reached, is: Too many open files
You can change the number of open files by editing
It is not clear what the maximum allowed limit is. Some say the default for Tomcat is 1096, but the (default) one for Linux is 30,000, which can be changed.
In the article I shared, the LinkedIn team were able to reach 250K connections on one host.
So that should give you a pretty good idea of the maximum number of SSE connections possible. It depends on your web server's max connection configuration, OS capacity, etc.
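To see which open-file limit actually applies to a server process, a small sketch like the following can help. It assumes Linux or another Unix-like system where the syscall package provides Getrlimit; openFileLimit is a name made up for this example.

```go
package main

import (
	"fmt"
	"syscall"
)

// openFileLimit returns the soft and hard limits on open file descriptors
// (RLIMIT_NOFILE) for the current process. Every open socket counts
// against this limit.
func openFileLimit() (soft, hard uint64, err error) {
	var rl syscall.Rlimit
	if err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, 0, err
	}
	return uint64(rl.Cur), uint64(rl.Max), nil
}

func main() {
	soft, hard, err := openFileLimit()
	if err != nil {
		fmt.Println("could not read the limit:", err)
		return
	}
	fmt.Printf("open files: soft=%d hard=%d\n", soft, hard)
}
```

The soft limit is the one the process runs into first; it can be raised up to the hard limit without extra privileges.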

High Performance Options for Remote services access

I have a service, foo, running on machine A. I need to access that service from machine B. One way is to launch a web server on A and do it via HTTP; code running under the web server on A accesses foo and returns the results. Another is to write a socket server on A; the socket server accesses service foo and returns the result.
HTTP connection initiation and handshake is expensive; sockets can be written, but I want to avoid that. What other options are available for high performance remote calls?
HTTP is just the protocol over the socket. If you are using TCP/IP networks, you are going to be using a socket. The HTTP connection initiation and handshake are not the expensive bits, it's TCP initiation that's really the expensive bit.
If you use HTTP 1.1, you can use persistent connections (Keep-Alive), which drastically reduces this cost, closer to that of keeping a persistent socket open.
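One way to check that persistent connections are really being reused is to trace the client, as in this sketch. It assumes Go's net/http with its default keep-alive behaviour and a throwaway local test server; reusedFlags and demo are hypothetical helper names.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// reusedFlags sends n GET requests to url with one client and records, for
// each request, whether it ran on a connection that had been used before.
func reusedFlags(client *http.Client, url string, n int) ([]bool, error) {
	flags := make([]bool, 0, n)
	for i := 0; i < n; i++ {
		var reused bool
		trace := &httptrace.ClientTrace{
			GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
		}
		req, err := http.NewRequest(http.MethodGet, url, nil)
		if err != nil {
			return nil, err
		}
		req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
		resp, err := client.Do(req)
		if err != nil {
			return nil, err
		}
		// Drain and close the body so the connection can go back to the idle pool.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
		flags = append(flags, reused)
	}
	return flags, nil
}

// demo starts a local test server and sends three requests to it.
func demo() ([]bool, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()
	return reusedFlags(srv.Client(), srv.URL, 3)
}

func main() {
	flags, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(flags)
}
```

With keep-alive working, the first request should report a fresh connection and the following ones a reused connection, so only the first request pays the TCP setup cost.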
It all depends on whether you want/need the higher-level protocol. Using HTTP means you will be able to do things like consume this service from a lot more clients, while writing much less documentation (if you write your own protocol, you will have to document it). HTTP servers also support things like authentication, cookies and logging out of the box. If you don't need these sorts of capabilities, then HTTP might be a waste. But I've seen few projects that don't need at least some of these things.
Adding to the answer of @Rob: as the question does not point precisely to an application or to performance boundaries, it would be good to look at the available options in a broader context, which is inter-process communication.
The Wikipedia page cleanly lists the options available and would be a good place to start.
What technology are you going to use? Let me answer for the Java world.
If your request rate is below 100/sec, you should not care about optimizations and should use the most versatile solution: HTTP.
A well-written asynchronous server like Netty-HTTP can easily handle 1000 requests per second on medium-level hardware.
If you need more or have constrained resources, you can move to a binary format. The most popular one out there is Google Protobuf (multi-language) + Netty (Java).
What you should know about HTTP performance:
HTTP can use Keep-Alive, which removes the reconnection cost for every request.
HTTP adds traffic overhead for every request and response: around 50-100 bytes.
HTTP clients and servers consume additional CPU for parsing HTTP headers; that becomes noticeable above the aforementioned 100 req/sec.
Be careful when selecting technology. Even in the 21st century it is hard to find a well-written HTTP server and client.
