Aiohttp server max connections

Aiohttp server max connections - python-asyncio

I cannot understand why the aiohttp (and asyncio in general) server implementation does not provide a way to limit the maximum number of concurrent connections (the number of accepted sockets, or the number of running request handlers); see https://github.com/aio-libs/aiohttp/issues/675. Without this limit, it is easy to run out of memory and/or file descriptors.
At the same time, the aiohttp client by default limits the number of simultaneous connections to 100 (https://docs.aiohttp.org/en/stable/client_advanced.html#limiting-connection-pool-size), aiojobs limits the number of running tasks and the size of the pending task list, nginx has a worker_connections limit, and any sync framework is limited by design by its number of worker threads.
While aiohttp can handle a lot of concurrent requests, this number is still limited. The aiojobs docs say: "The Scheduler has implied limit for amount of concurrent jobs (100 by default). ... It prevents a program over-flooding by running a billion of jobs at the same time". And still, we can happily spawn a "billion" (well, until we run out of resources) aiohttp handlers.
So the question is: why is it implemented the way it is? Am I missing some important detail? I think we can somehow pause request handlers using a Semaphore, but the socket is still accepted by aiohttp and a coroutine is still spawned, in contrast with nginx. Also, when deploying behind nginx, the number of worker_connections and the desired aiohttp limit will certainly be different (because nginx may also serve static files).

Based on the developers' comments on the linked issue, the reasons for this choice are the following:
The application can return a 4xx or 5xx response if it detects that the number of connections is larger than what it can reasonably handle. (This differs from the Semaphore idiom, which would effectively queue the connection.)
Throttling the number of server connections is more complicated than just specifying a number, because the limit might well depend on what your coroutines are doing, i.e. it should at least be path-based. Andrew Svetlov links to NGINX documentation about connection limiting to support this.
It is anyway recommended to put aiohttp behind a specialized front server such as NGINX.
More detail than this can only be provided by the developer(s), who have been known to read this tag.
At this point, it appears that the recommended solution is to either use a reverse proxy for limiting, or an application-based limit like this decorator (untested):
import aiohttp.web

REQUEST_LIMIT = 100

def throttle_handle(real_handle):
    _nrequests = 0  # number of requests currently being handled

    async def handle(request):
        nonlocal _nrequests
        if _nrequests >= REQUEST_LIMIT:
            return aiohttp.web.Response(
                status=429, text="Too many connections")
        _nrequests += 1
        try:
            return await real_handle(request)
        finally:
            _nrequests -= 1
    return handle

@throttle_handle
async def handle(request):
    ... your handler here ...
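
The difference between rejecting and queueing mentioned in the first point can be sketched with the standard library alone. This is a minimal illustration, not aiohttp code: `TooBusy`, `reject_when_busy` and `queue_when_busy` are hypothetical names, and the short sleep stands in for real handler work.

```python
import asyncio


class TooBusy(Exception):
    """Hypothetical error standing in for a 429/503 response."""


def reject_when_busy(limit):
    """Fail fast once `limit` calls are already in flight."""
    def decorator(func):
        running = 0

        async def wrapper(*args, **kwargs):
            nonlocal running
            if running >= limit:
                raise TooBusy(f"{limit} calls already in flight")
            running += 1
            try:
                return await func(*args, **kwargs)
            finally:
                running -= 1
        return wrapper
    return decorator


def queue_when_busy(limit):
    """Semaphore idiom: extra calls wait for a free slot instead of failing."""
    def decorator(func):
        sem = None  # created lazily so it belongs to the running event loop

        async def wrapper(*args, **kwargs):
            nonlocal sem
            if sem is None:
                sem = asyncio.Semaphore(limit)
            async with sem:
                return await func(*args, **kwargs)
        return wrapper
    return decorator


async def demo():
    @reject_when_busy(2)
    async def rejecting(i):
        await asyncio.sleep(0.01)  # simulated work
        return i

    @queue_when_busy(2)
    async def queueing(i):
        await asyncio.sleep(0.01)  # simulated work
        return i

    rejected = await asyncio.gather(
        *(rejecting(i) for i in range(5)), return_exceptions=True)
    queued = await asyncio.gather(*(queueing(i) for i in range(5)))
    return rejected, queued
```

Running `asyncio.run(demo())` shows that with the rejecting variant only the first two calls succeed and the rest fail immediately, while with the semaphore variant all five eventually complete because the extra calls are queued.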

To limit concurrent connections on the client side, you can use aiohttp.TCPConnector, or aiohttp.ProxyConnector if you are using a proxy. Just pass it to the session instead of relying on the default connector.
aiohttp.ClientSession(
    connector=aiohttp.TCPConnector(limit=1)
)
aiohttp.ClientSession(
    connector=aiohttp.ProxyConnector.from_url(proxy_url, limit=1)
)

Related

aiohttp Error Rate Increases with Number of Connections

I am trying to get the status code from millions of different sites using asyncio and aiohttp. I run the code below with different numbers of connections (but the same request timeout) and get very different results; specifically, a much higher number of the following exception:
'concurrent.futures._base.TimeoutError'
The code
import pandas as pd
import asyncio
import aiohttp

out = []
CONNECTIONS = 1000
TIMEOUT = 10

async def fetch(url, session, loop):
    try:
        async with session.get(url, timeout=TIMEOUT) as response:
            res = response.status
            out.append(res)
            return res
    except Exception as e:
        _exception = 'Error: ' + str(type(e))
        out.append(_exception)
        return _exception

async def bound_fetch(sem, url, session, loop):
    async with sem:
        await fetch(url, session, loop)

async def run(urls, loop):
    tasks = []
    sem = asyncio.Semaphore(value=CONNECTIONS, loop=loop)
    _connector = aiohttp.TCPConnector(limit=CONNECTIONS, loop=loop)
    async with aiohttp.ClientSession(connector=_connector, loop=loop) as session:
        for url in urls:
            task = asyncio.ensure_future(bound_fetch(sem, url, session, loop))
            tasks.append(task)
        responses = await asyncio.gather(*tasks, return_exceptions=True)
    return responses

## BEGIN ##
tlds = open('data/sample_1k.txt').read().splitlines()
urls = ['http://{}'.format(x) for x in tlds[1:]]
loop = asyncio.get_event_loop()
future = asyncio.ensure_future(run(urls, loop))
ans = loop.run_until_complete(future)
print(str(pd.Series(out).value_counts()))
Results for CONNECTIONS=1000 and CONNECTIONS=100 (output not preserved)
Is this a bug? These sites do respond with a status code, and when run sequentially or with fewer connections there is no timeout error, so why is this happening? The other exceptions seem stable as you change the number of connections. The ClientOSErrors are from sites that actually time out or do not respond; honestly, I don't really know where the concurrent.futures._base.TimeoutError errors are coming from.

Imagine you opened 1000 URLs in a browser simultaneously. I bet you'd notice that many of them aren't loaded after 10 seconds. It's not a bug, it's a limit of your machine's resources.
The more parallel requests you make, the less network capacity, CPU time and RAM is available for each one, and the higher the chance that a request won't complete before its timeout.
If you see many timeouts with 1000 connections, make fewer connections (and maybe increase the timeout). Based on the aiohttp documentation, using different ClientSession instances may also help:
Unless you are connecting to a large, unknown number of different
servers over the lifetime of your application, it is suggested you use
a single session for the lifetime of your application

I've had the same issue. Have a look at the details of the ClientOSErrors and you might see "Too many open files"; if so, you need to increase the OS limit on the number of file descriptors.
Either way, you'll get more information if you print the whole exceptions, not just their types.
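
Both suggestions can be sketched in a few lines of Python. This is a hedged illustration for Unix-like systems only (the `resource` module is not available on Windows); `raise_fd_limit` and `describe_exception` are hypothetical helper names, and 4096 is an arbitrary example target.

```python
import resource  # Unix-only standard library module


def raise_fd_limit(target=4096):
    """Raise the soft limit on open file descriptors towards `target`.

    The soft limit can only be raised up to the hard limit; going beyond
    that needs OS-level configuration (for example ulimit or limits.conf).
    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough, leave it alone
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft


def describe_exception(exc):
    """Return the exception type together with its message, not just the type."""
    return f"{type(exc).__module__}.{type(exc).__qualname__}: {exc}"
```

In the fetch function from the question, storing `describe_exception(e)` instead of `str(type(e))` would keep messages such as "Too many open files" visible in the final counts.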

How can I terminate myself if I run too long?

I have a application that runs periodically (it's a scheduled task). The task is launched once a minute, and normally only takes a few seconds to do its business, then exits.
But there's a ~1 in 80,000 chance (every two or three months) that the application will hang. The root cause is because we're using Microsoft ServerXmlHttpRequest component to perform some work, and sometimes it just decides to hang. The virtue of ServerXmlHttpRequest over XmlHttpRequest is that the latter is not recommended for important scenarios, such as where reliability and security are important (which is true of an unattended server component):
The ServerXMLHTTP object offers functionality similar to that of the XMLHTTP object. Unlike XMLHTTP, however, the ServerXMLHTTP object does not rely on the WinInet control for HTTP access to remote XML documents. ServerXMLHTTP uses a new HTTP client stack. Designed for server applications, this server-safe subset of WinInet offers the following advantages:
Reliability — The HTTP client stack offers longer uptimes. WinInet features that are not critical for server applications, such as URL caching, auto-discovery of proxy servers, HTTP/1.1 chunking, offline support, and support for Gopher and FTP protocols are not included in the new HTTP subset.
Security — The HTTP client stack does not allow a user-specific state to be shared with another user's session. ServerXMLHTTP provides support for client certificates.
The job is being run as a scheduled task. I need the task to continue to run periodically, killing the existing process if it's dead.
The Windows Task Scheduler does have an option to forcibly stop a task that has been running too long.
The only downside to that approach is that it doesn't work: it simply does not stop the task. The hung process keeps running.
Given that I cannot trust the Microsoft ServerXmlHttpRequest to not arbitrarily lock up, and the task scheduler is unable to terminate the scheduled task, I need some way to do it myself.
Jobs
I tried looking into using the Job Objects API:
A job object allows groups of processes to be managed as a unit. Job objects are namable, securable, sharable objects that control attributes of the processes associated with them. A job can enforce limits such as working set size, process priority, and end-of-job time limit on each process that is associated with the job.
That one note sounded like exactly what i needed:
A job can enforce limits such as end-of-job time limit on each process that is associated with the job.
The only downside to that approach is that it does not work. Jobs cannot impose a wall-clock time limit on a process. They can only impose a user-mode time limit on a process:
PerProcessUserTimeLimit
If LimitFlags specifies JOB_OBJECT_LIMIT_PROCESS_TIME, this member is the per-process user-mode execution time limit, in 100-nanosecond ticks.
If the process is idle (for example, sitting at a MsgWaitForSingleObject as ServerXmlHttpRequest is), then it will accumulate no user time. I tested it. I created a job with a 1 second time limit, and placed my own process into it. As long as I don't move the mouse around my test application, it quite happily sits there for longer than one second.
Watchdog Thread
The only other technique I can imagine, given that my main thread is indefinitely blocked, is another thread: spawn a second thread that sleeps for my three minutes and then calls ExitProcess:
Int32 watchdogTimeoutSeconds = FindCmdLineSwitch("watchdog", 0);
if (watchdogTimeoutSeconds > 0)
    Thread thread = new Thread(KillMeCallback, new IntPtr(watchdogTimeoutSeconds));

void KillMeCallback(IntPtr data)
{
    Int32 secondsUntilProcessIsExited = data.ToInt32();
    if (secondsUntilProcessIsExited <= 0)
        return;
    Sleep(secondsUntilProcessIsExited*1000); //seconds --> milliseconds
    LogToEventLog(ExtractFilename(Application.ExeName),
        "Watchdog fired after "+secondsUntilProcessIsExited.ToString()+" seconds. Process will be forcibly exited.", EVENTLOG_WARNING_TYPE, 999);
    ExitProcess(999);
}
And that works. The only downside is that it's a bad idea.
Can anyone think of anything better?
Edit
For now I will implement a
Contoso.exe /watchdog 180
So the process will be exited after 180 seconds. It means the duration is configurable, or can be removed completely easily in the field.

I went the route of passing a special WatchDog argument to my process on the command line:
>Contoso.exe /watchdog 180
During initialization I check for the presence of the WatchDog option, with an integer number of seconds after it:
String s = Toolkit.FindCmdLineOption("watchdog", ["/", "-"]);
if (s <> "")
{
    Int32 seconds = StrToIntDef(s, 0);
    if (seconds > 0)
        RunInThread(WatchdogThreadProc, Pointer(seconds));
}
and my thread procedure:
void WatchdogThreadProc(Pointer Data)
{
    Int32 secondsUntilProcessIsExited = Int32(Data);
    if (secondsUntilProcessIsExited <= 0)
        return;
    Sleep(secondsUntilProcessIsExited*1000); //seconds -> milliseconds
    LogToEventLog(ExtractFileName(ParamStr(0)),
        Format("Watchdog fired after %d seconds. Process will be forcibly exited.", secondsUntilProcessIsExited),
        EVENTLOG_WARNING_TYPE, 999);
    ExitProcess(2);
}
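
The same watchdog idea can be sketched in Python. This is an analogy under stated assumptions rather than a port of the code above: `start_watchdog` is a hypothetical helper, and `os._exit` is used as the closest counterpart to ExitProcess because it ends the process immediately without running cleanup handlers.

```python
import os
import threading


def start_watchdog(timeout_seconds, on_timeout=None):
    """Start a background timer that force-exits the process.

    Returns the timer so the caller can cancel() it once the real work has
    finished, or None when the watchdog is disabled (timeout <= 0).
    """
    if timeout_seconds <= 0:
        return None
    if on_timeout is None:
        # Default action: terminate right away with exit code 2,
        # skipping atexit handlers and pending finally blocks.
        def on_timeout():
            os._exit(2)
    timer = threading.Timer(timeout_seconds, on_timeout)
    timer.daemon = True  # never keep the process alive just for the watchdog
    timer.start()
    return timer
```

A script would call `start_watchdog(180)` early in its start-up, mirroring the `/watchdog 180` switch. The `on_timeout` parameter exists mainly so that logging can be added before exiting, or so that the behavior can be tested without killing the interpreter.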

Understanding Celluloid Pool

I guess my understanding of Celluloid Pool is somewhat broken. I will try to explain below, but first a quick note.
Note: Our system is running against a very fast client passing messages over ZeroMQ.
With the following vanilla Celluloid app
class VanillaClient
  include Celluloid::ZMQ

  def read
    loop { async.evaluate_response(socket.read_multipart) }
  end

  def evaluate_response(data)
    ## the reason for using defer can be found over here.
    Celluloid.defer do
      ExternalService.execute(data)
    end
  end
end
Our system fails after some time with the reason 'Can't spawn more thread' (or something like it).
So we intended to use a Celluloid Pool (to avoid the above-mentioned problem) so that we can limit the number of threads that are spawned.
My understanding of Celluloid Pool is:
Celluloid Pool maintains a pool of actors for you so that you can distribute your tasks in parallel.
Hence, I decided to test it, but according to my test cases it seems to behave serially (i.e. things never get distributed or happen in parallel).
Example to replicate this:
sender-1.rb
## Send message `1` to the the_client.rb
sender-2.rb
## Send message `2` to the the_client.rb
the_client.rb
## take message from sender-1 and sender-2 and return it back to receiver.rb
## heads on, the `sleep` is introduced to test/replicate the IO block that happens in the actual code.
receiver.rb
## print the message obtained from the_client.rb
If sender-2.rb is run before sender-1.rb, it appears that the pool gets blocked for 20 sec (the sleep time in the_client.rb) before consuming the data sent by sender-1.rb.
It behaves the same in ruby-2.2.2 and under jRuby-9.0.5.0. What could be the possible causes for the pool to act in such a manner?

Your pool call is not asynchronous.
Execution of evaluate on #pool still needs to be .async, as in your original example that did not use pools. You still want asynchronous behavior, but you also want to have multiple handler actors.
Next you will likely hit the Pool.async bug.
https://github.com/celluloid/celluloid-pool/issues/6
This means after 5 hits to evaluate your pool will become unresponsive until at least one actor in the pool is finished. Worst case scenario, if you get 6+ requests in rapid succession, the 6th will then take 120 seconds, because it will take 5*20 seconds before it executes, then 20 seconds to execute itself.
Depending on what your actual operation is that's causing you delays -- you might need to adjust your pool size down the line.
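
The first point above (a pool only helps if calls are submitted asynchronously) is not specific to Celluloid. As a rough analogy in Python, assuming a `concurrent.futures` thread pool stands in for the actor pool and the hypothetical `slow_task` stands in for the blocking work:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_task(n, delay=0.05):
    time.sleep(delay)  # stands in for blocking I/O in the real worker
    return n


def run_synchronously(pool, items):
    # Waiting for each result before submitting the next call keeps only
    # one worker busy at a time, so the pool behaves serially.
    return [pool.submit(slow_task, i).result() for i in items]


def run_asynchronously(pool, items):
    # Submit everything first, then collect: workers run in parallel.
    futures = [pool.submit(slow_task, i) for i in items]
    return [f.result() for f in futures]


def timed(fn, *args):
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

With a pool of four workers and four items, `timed(run_synchronously, pool, range(4))` takes roughly four times the delay, while `timed(run_asynchronously, pool, range(4))` takes roughly one delay. Both return the same results.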

Asynchronous IO server : Thin(Ruby) and Node.js. Any difference?

I want to clarify my understanding of asynchronous IO and non-blocking servers.
When dealing with Node.js, it is easy to understand the concept:
var express = require('express');
var app = express();
app.get('/test', function(req, res){
setTimeout(function(){
console.log("sleep doesn't block, and now return");
res.send('success');
}, 2000);
});
var server = app.listen(3000, function() {
console.log('Listening on port %d', server.address().port);
});
I know that while node.js is waiting for the 2 seconds of setTimeout, it is able to serve another request at the same time; once the 2 seconds have passed, it calls the callback function.
How about the Ruby world, with the Thin server?
require 'sinatra'
require 'thin'
set :server, %w[thin]
get '/test' do
  sleep 2  # <----
  "success"
end
The code snippet above uses the Thin server (non-blocking, asynchronous IO). Regarding asynchronous IO, my question is: when execution reaches sleep 2, is the server able to serve another request at the same time, given that sleep 2 is a blocking call?
The difference between the node.js and Sinatra code is that
node.js is written in an asynchronous way (callback approach)
ruby is written in a synchronous way (but works asynchronously under the covers? Is that true?)
If the above statement is true,
it seems that ruby is better, as the code looks cleaner than a bunch of callback code in node.js
Kit

Sinatra / Thin
Thin will be started in threaded mode
if it is started by Sinatra (i.e. with ruby asynchtest.rb).
This means that your assumptions are correct: when reaching sleep 2, the server is able to serve another request at the same time, but on another thread.
I would like to show this behavior with a simple test:
#asynchtest.rb
require 'sinatra'
require 'thin'
set :server, %w[thin]
get '/test' do
  puts "[#{Time.now.strftime("%H:%M:%S")}] logging /test starts on thread_id:#{Thread.current.object_id} \n"
  sleep 10
  "[#{Time.now.strftime("%H:%M:%S")}] success - id:#{Thread.current.object_id} \n"
end
Let's test it by starting three concurrent HTTP requests (the timestamps and thread ids are the relevant parts to observe):
The test demonstrates that we got three different threads (one for each concurrent request), namely:
70098572502680
70098572602260
70098572485180
Each of them starts concurrently (the start is practically immediate, as we can see from the execution of the puts statement), then waits (sleeps) ten seconds, and after that time flushes the response to the client (the curl process).
Deeper understanding
Quoting Wikipedia - Asynchronous_I/O:
In computer science, asynchronous I/O, or non-blocking I/O is a form of input/output processing that permits other processing to continue before the transmission has finished.
The above test (Sinatra/Thin) demonstrates that it is possible to start a first request from curl (the client) to Thin (the server) and, before we get the response to the first one (before the transmission has finished), to start a second and a third request. These later requests aren't queued but start concurrently with the first one; in other words, the server permits other processing to continue.
Basically this is a confirmation of @Holger Just's comment: sleep blocks the current thread, but not the whole process. That said, in Thin, most stuff is handled in the main reactor thread, which thus works similarly to the one thread available in node.js: if you block it, nothing else scheduled in this thread will run. In Thin/EventMachine, you can however defer stuff to other threads.
These linked answers have more details: "is-sinatra-multi-threaded" and "Single thread still handles concurrency request?"
Node.js
To compare the behavior of the two platforms, let's run an equivalent asynchtest.js on node.js. As we did in asynchtest.rb, to understand what happens we add a log line when processing starts.
Here is the code of asynchtest.js:
var express = require('express');
var app = express();

// Assumed helper (not defined in the original snippet): current time as HH:MM:SS
function getTime() {
  return new Date().toTimeString().slice(0, 8);
}

app.get('/test', function(req, res){
  console.log("[" + getTime() + "] logging /test starts\n");
  setTimeout(function(){
    console.log("sleep doesn't block, and now return");
    res.send('[' + getTime() + '] success \n');
  }, 10000);
});
var server = app.listen(3000, function(){
  console.log("listening on port %d", server.address().port);
});
Let's start three concurrent requests in Node.js and observe the same behavior:
The result is, of course, very similar to what we saw in the previous case.
This response doesn't claim to be exhaustive on the subject, which is very complex and deserves further study and specific evidence before you draw conclusions for your own purposes.

There are lots of subtle differences, almost too many to list here.
First, don't confuse "coding style" with "event model". There's no reason you need to use callbacks in Node.js (see the various 'promise' libraries). And Ruby has EventMachine if you like callback-structured code.
Second, Thin (and Ruby) can have many different multi-tasking models. You didn't specify which one.
In Ruby 1.8.7, "Thread" will create green threads. The language actually turns a "sleep N" into a timer call, and allows other statements to execute. But it's got a lot of limitations.
Ruby 1.9.x can create native OS threads. But those can be hard to use (spinning up 1000's is bad for performance, etc.)
Ruby 1.9.x has "Fibers" which are a much better abstraction, very similar to Node.
In any comparison, you also have to take into account the entire ecosystem: Pretty much any node.js code will work in a callback. It's really hard to write blocking code. But many Ruby libraries are not Thread-aware out of the box (require special configuration, etc). Many seemingly simple things (DNS) can block the entire ruby process.
You also need to consider the language. Node.JS, is built on JavaScript, which has a lot of dark corners to trip you up. For example, it's easy to assume that JavaScript has Integers, but it doesn't. Ruby has fewer dark corners (such as Metaprogramming).
If you are really into evented architectures, you should really consider Go. It has the best of all worlds: The evented architecture is built in (just like in Node, except it's multiprocessor-aware), there are no callbacks (just like in Ruby), plus it has first-class messaging (very similar to Erlang). As a bonus, it will use a fraction of the memory of a Node or Ruby process.

No, node.js is fully asynchronous; setTimeout will not block script execution, it just delays the part inside it. So these pieces of code are not equivalent. Which platform to choose for your project depends on the tasks you want to accomplish.
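
The distinction between a timer-style delay and a blocking sleep inside a single-threaded event loop can be illustrated with Python's asyncio. This is an analogy only, not Node.js or Thin code; the handler names are hypothetical and the delay is shortened so the sketch runs quickly.

```python
import asyncio
import time


async def handler_nonblocking(delay):
    # Like setTimeout: hands control back to the event loop while waiting.
    await asyncio.sleep(delay)
    return "success"


async def handler_blocking(delay):
    # Like a plain sleep on the reactor thread: nothing else can run meanwhile.
    time.sleep(delay)
    return "success"


async def serve_three(handler, delay=0.05):
    """Run three 'requests' concurrently and report the total elapsed time."""
    start = time.perf_counter()
    results = await asyncio.gather(*(handler(delay) for _ in range(3)))
    return results, time.perf_counter() - start
```

`asyncio.run(serve_three(handler_nonblocking))` finishes in roughly one delay, while `asyncio.run(serve_three(handler_blocking))` needs roughly three, because each blocking sleep holds the only thread.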

How can I tell if my Ruby server script is being overloaded?

I have a daemonized ruby script running on my server that looks like this:
@server = TCPServer.open(61101)
loop do
  @thr = Thread.new(@server.accept) do |sock|
    Thread.current[:myArrayOfHashes] = [] # hashes containing attributes of myObject
    SystemTimer.timeout_after(5) do
      Thread.current[:string] = sock.gets
      sock.close
      # parse the string and load the data into myArrayOfHashes
      Myobject.transaction do # Update the myObjects Table
        Thread.current[:myArrayOfHashes].each do |h|
          Thread.current[:newMyObject] = Myobject.new
          # load up the new object with data
          Thread.current[:newMyObject].save
        end
      end
    end
  end
  @thr.join
end
This server receives and manages data for my rails application which is all running on Mac OS 10.6. The clients call the server every 15 minutes on the 15 and while I currently only have 16 or so clients calling every 15 min on the 15, I'm wondering about the following:
If two clients call at close enough to the same time, will one client's connection attempt fail?
How can I figure out how many client connections my server can accommodate at the same time?
How can I monitor how much memory my server is using?
Also, is there an article you can point me toward that discusses the best way to implement this kind of a server? I mean can I have multiple instances of the server listening on the same port? Would that even help?
I am using Bluepill to monitor my server daemons.

1 and 2
The answer is no: two clients connecting close to each other will not make the connection fail (however, many clients connecting at once may fail, see below).
The reason is that the operating system has a so-called listening queue built into all server sockets. So even if you are not calling accept fast enough in your program, the OS will still keep buffering incoming connections for you. It will buffer these connections as long as the listening queue is not full.
Now what is the size of this queue?
A commonly used default size is 5. The size is set when you call listen on the socket after creating it (see the man page for listen).
For Ruby, TCPServer automatically calls listen for you, and if you look at the C source code you will find that it indeed sets the size to 5:
https://github.com/ruby/ruby/blob/trunk/ext/socket/ipsocket.c#L108
SOMAXCONN is defined as 5 here:
https://github.com/ruby/ruby/blob/trunk/ext/socket/mkconstants.rb#L693
Now what happens if you don't call accept fast enough and the queue gets filled?
The answer is found in the man page of listen:
The backlog argument defines the maximum length to which the queue of pending connections for sockfd may grow. If a connection request arrives when the queue is full, the client may receive an error with an indication of ECONNREFUSED or, if the underlying protocol supports retransmission, the request may be ignored so that a later reattempt at connection succeeds.
In your code, however, there is one problem which can make the queue fill up if more than 5 clients try to connect at the same time: you're calling @thr.join at the end of the loop.
What effectively happens when you do this is that your server will not accept any new incoming connections until all your stuff inside your accept-thread has finished executing.
So if the database stuff and the other things you are doing inside the accept-thread takes a long time, the listening queue may fill up in the meantime. It depends on how long your processing takes, and how many clients could potentially be connecting at the exact same time.
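
A minimal Python sketch of the alternative: the accept loop hands each connection to its own thread and goes straight back to accept without joining, and the listen backlog is set explicitly. The names `start_server` and `handle_client` and the backlog of 128 are illustrative assumptions, not taken from the question's code.

```python
import socket
import threading


def handle_client(conn):
    """Read one line from the client and acknowledge it."""
    with conn, conn.makefile("r") as reader:
        line = reader.readline()
        conn.sendall(f"got {line.strip()}\n".encode())


def serve(server_sock):
    while True:
        try:
            conn, _addr = server_sock.accept()
        except OSError:
            break  # listening socket was closed
        # No join here: return to accept() immediately so the OS
        # listen queue is drained while handlers are still running.
        threading.Thread(target=handle_client, args=(conn,), daemon=True).start()


def start_server(host="127.0.0.1", port=0, backlog=128):
    """Start the accept loop in a background thread; port 0 picks a free port."""
    server_sock = socket.create_server((host, port), backlog=backlog)
    threading.Thread(target=serve, args=(server_sock,), daemon=True).start()
    return server_sock
```

The port actually chosen is available from `start_server().getsockname()[1]`. A production server would also cap the number of handler threads, which this sketch leaves out.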
3
You didn't say which platform you are running on, but on linux/osx the easiest way is to just run top in your console. For more advanced memory monitoring options you might want to check these out:
ruby/ruby on rails memory leak detection
track application memory usage on heroku
