Blockers for maximum number of http requests from a pod - go

I have a Go app which is deployed to two 8 core pods instances on Kubernetes.
From it, I receive a list of ids than later on I use to retrieve some data from another service by sending each id to a POST endpoint.
I am using a bounded concurrency pattern to have a maximum number of simulataneous goroutines (and therefore, of requests) to this external service.
I set the limit of concurrency as:
sem := make(chan struct{}, MAX_GO_ROUTINES)
With this setup I started playing around with the MAX_GO_ROUTINES number by increasing it. I usually receive around 20000 ids to check. So I have played around by setting MAX_GO_ROUTINES from anywhere between 100 and 20000.
One thing I notice is as I go higher and higher some requests start to fail with the message: connection reset from this external service.
So my questions are:
What is the blocker in this case?
What is the limit of concurrent HTTP POST requests a server with 8 cores and 4GB of ram can send? Is it a memory limit? or file descriptors limit?
Is the error I am getting coming from my server or from the external one?

What is the blocker in this case?
As the comment mentioned: HTTP "connection reset" generally means:
the connection was unexpectedly closed by the peer. The server appears
to have dropped the connection on the unsuspecting HTTP client before
sending back a response. This is most likely due to the high load.
Most webservers (like nginx) have a queue where they stage connections while they wait to be serviced. When the queue exceeds some limit the connections may be shed and "reset". So it's most likely this is your upstream service being saturated (i.e. your app sends more requests than it can service and overloads its queue)
What is the limit of concurrent HTTP POST requests a server with 8 cores and 4GB of ram can send? Is it a memory limit? or file descriptors limit?
All :) At some point your particularl workload will overload a logical limit (like file descriptors) or a "physical" limit like memory. Unfortunately the only way to truly understand which resource is going to be exhausted (and which constraints you hit up against) is to run tests and profile and benchmark your workload :(
Is the error I am getting coming from my server or from the external one?
HTTP Connection reset is most likely the external, it indicates the connection peer (the upstream service) reset the connection.

Related

How to fix HTTP/2.0 504 Gateway Timeout for multi simultaneous XHR connections when using HTTP/2

I activate HTTP/2 support on my server. Now i got the problem with AJAX/jQuery scipts like upload or Api handling.
After max_input_time of 60sec for php i got: [HTTP/2.0 504 Gateway Timeout 60034ms]
with HTTP/1 only a few connections where startet simultaneously and when one is finished a nother starts.
with HTTP/2 all starts at once.
when fore example 100 images would uploaded it takes to long for all.
I don't wish to change the max_input_time. I hope to limit the simultaneous connections in the scripts.
thank you
HTTP/2 intentionally allows multiple requests in parallel. This differs from HTTP/1.1 which only allowed one request at a time (but which browsers compensated for by opening 6 parallel connections). The downside to drastically increasing that limit is you can have more requests on the go at once, contending for bandwidth.
You’ve basically two choices to resolve this:
Change your application to throttle uploads rather than expecting the browser or the protocol to do this for you.
Limit the maximum number of concurrent streams allowed by your webserver. In Apache for example, this is controlled by the H2MaxSessionStreams Directive while in Nginx it is similarly controlled by the
http2_max_concurrent_streams config. Other streams will need to wait.

Using HTTP2 how can I limit the number of concurrent requests?

We have a system client <-> server working over HTTP1.1. The client is making hundreds (sometimes thousands) of concurrent requests to the server.
Because the default limitations of the browsers to HTTP1.1 connections the client is actually making these requests in batches of (6 ~ 8) concurrent requests, we think we can get some performance improvement if we can increase the number of concurrent requests.
We moved the system to work over HTTP2 and we see the client requesting all the requests simultaneously as we wanted.
The problem now is the opposite: the server can not handle so many concurrent requests.
How can we limit the number of concurrent request the Client is doing simultaneous to something more manageable for the server? let's say 50 ~ 100 concurrent requests.
We were assuming that HTTP2 can allow us to graduate the number of concurrent
connections:
https://developers.google.com/web/fundamentals/performance/http2/
With HTTP/2 the client remains in full control of how server push is
used. The client can limit the number of concurrently pushed streams;
adjust the initial flow control window to control how much data is
pushed when the stream is first opened; or disable server push
entirely. These preferences are communicated via the SETTINGS frames
at the beginning of the HTTP/2 connection and may be updated at any
time.
Also here:
https://stackoverflow.com/a/36847527/316700
O maybe, if possible, we can limit this in the server side (what I think is more maintainable).
But looks like these solutions are talking about Server Push and what we have is the Client Pulling.
In case help in any way our architecture looks like this:
Client ==[http 2]==> ALB(AWS Beanstalk) ==[http 1.1]==> nginx ==[http 1.0]==> Puma
There is special setting in SETTINGS frame
You can specify SETTINGS_MAX_CONCURRENT_STREAMS to 100 on the server side
Reference

How many SSE connections can a web server maintain?

I'm experimenting with server-sent events (SSE) as an alternative to websockets for real-time data pushing (data in my application is primarily one-directional).
How scalable would this be? I know that each SSE connection uses an HTTP request -- does this mean that a web server can handle as many SSE connections as HTTP requests (something like this answer)? I feel as though this might be the case, but I'm not sure how a SSE connection works and if it is substantially more complex/resource-hungry than a simple HTTP request.
I'm mostly wondering how this compares to the number of concurrent websockets a browser can keep open. This answer suggests that only ~1400-1800 sockets can be handled by a server at the same time.
Can someone provide some insight on this?
(To clarify, I am not asking about how many SSE connections can be kept open from the client; I am asking about how many can be reasonably kept open by a web server.)
Tomcat 8 (web server to give an example) and above that uses the NIO connector for handling incoming requst. It can service max 10,000 concurrent connections(docs). It does not say anything about max connections pers se. They also provide another parameter called acceptCount which is the fall back if connections exceed 10,000.
socket connections are treated as files. Every incoming connection to tomcat is like opening a socket and depending on the OS e.g in linux depends on the file-descriptor policy. You will find a common error when too many connections are open or max connections have been reached as the following
java.net.SocketException: Too many files open
You can change the number of open files by editing
/etc/security/limits.conf
It is not clear what is max limit that is allowed. Some say default for tomcat is 1096 but the (default) one for linux is 30,000 which can be changed.
On the article I have shared the linkedIn team were able to go 250K connections on one host.
So that should give you a pretty good idea about max sse connections possible. depends on your web server max connection configuration, OS capacity etc.

Slow HTTP vs Web Sockets - Resource utilization

If a bunch of "Slow HTTP" connection to a server can consume so much resources so as to cause a denial of service, why wouldn't a bunch of web sockets to a server cause the same problem?
The accepted answer to a different SO question says that it is almost free to maintain a idle connection.
If it costs nothing to maintain an open TCP connection, why does a "Slow HTTP" cause denial of service?
A WebSocket and a "slow" HTTP connection both use an open connection. The difference is in expectations of the server design.
Typical HTTP servers do not need to handle a large number of open connections and are designed around the assumption that the number of open connections is small. If the server does not protect against slow clients, then an attacker can force a server designed around this assumption to hit a resource limit.
Here are a couple of examples showing how the different expectations can impact the design:
If you only have a few HTTP requests in flight at a time, then it's OK to use a thread per connection. This is not a good design for a WebSocket server.
The default file descriptor limits are often adequate for typical HTTP scenarios, but not for a large numbers of connections.
It is possible to design an HTTP server to handle a large number of open connections and several servers do so out of the box.

Go(lang): about MaxIdleConnsPerHost in the http client's transport

In case MaxIdleConnsPerHost is set to a high number, let's say 1000, the number of connections open will still depend on the other host, right? I mean, allowing 1000 idle connections with the same host will result in 1000 connections open as long as these are not closed by the other host?
So, effectively setting this value to a high number, will result in never closing a connection, but waiting for the other host to do it? am I interpreting this correctly?
Your understanding is correct. MaxIdleConnsPerHost restricts how many connections there are which are not actively serving requests, but which the client has not closed.
Idle connections are useful for web browsers because they can keep reusing connections for subsequent HTTP requests to the same server. Idle connections have a cost for the server, though. They use kernel resources, and you may run up against per process limits or kernel limits on the number of open connections, files, or handles, which may cause unexpected errors in your program, or even for other programs on the same machine.
As such, be careful when increasing MaxIdleConnsPerHost to a large number. It only makes sense to increase idle connections if you are seeing many connections in a short period from the same clients.

Resources