Flask-SocketIO client connections failing at just over 1000 connections - socket.io

Ubuntu 20.04 on EC2 instance
Flask-SocketIO 5.1.1
python-engineio 4.2.1
python-socketio 5.4.0
simple-websocket 0.3.0
I'm doing some load testing on this server to maximise the number of client connections I can get out of it. The server Python code includes the following main elements:
from flask import Flask, request
from flask_socketio import SocketIO, emit, join_room, leave_room, send

app = Flask(__name__)
app.config['SECRET_KEY'] = appConfig.SECRET_KEY
socketio = SocketIO(app, cors_allowed_origins="*")

@socketio.on('json')
def handle_json(json):
    emit("json", json, broadcast=True, include_self=False)

if __name__ == '__main__':
    socketio.run(
        app,
        host='0.0.0.0',
        port=443,
        keyfile='MYPATH/privkey.pem',
        certfile='MYPATH/fullchain.pem',
        max_size=100000
    )
Note the max_size arg in socketio.run() passing through to the max_size eventlet param.
I have set the file descriptor limits with sudo nano /etc/security/limits.conf as follows:
* soft nproc 100000
* hard nproc 100000
* soft nofile 100000
* hard nofile 100000
I can confirm this with ulimit -a which gives output:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 1806
max locked memory (kbytes, -l) 65536
max memory size (kbytes, -m) unlimited
open files (-n) 100000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 100000
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
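As an extra cross-check (not part of my original setup, just a minimal sketch using the standard library resource module), the limit can also be read from inside the Python process itself, in case the service ends up seeing a different limit than my shell:

import resource

# Soft and hard limits on open file descriptors, as seen by this process
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"RLIMIT_NOFILE: soft={soft}, hard={hard}")

# If needed, the soft limit can be raised up to the hard limit at startup
# resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))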
I am load testing with the following code (reduced for brevity), which creates 2000 connections and runs from my local machine:
import socketio

URL = "MY_SOCKETIO_SERVER_URL"

clients = []
for k in range(2000):
    sio = socketio.Client()
    sio.connect(URL)
    clients.append(sio)

for client in clients:
    client.disconnect()
When I run this, around 1176 connections are made successfully; after that point new connections fail to be created, htop shows a high-CPU /usr/lib/snapd/snapd process, and the socket.io Python app terminates after a few seconds, printing the message "Killed".
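For completeness, here is a hedged variant of the same test loop (not my original code) that catches the connection error so the exact failure index is visible; ConnectionError here is the exception python-socketio raises when connect() fails:

import socketio
from socketio.exceptions import ConnectionError as SioConnectionError

URL = "MY_SOCKETIO_SERVER_URL"

clients = []
for k in range(2000):
    sio = socketio.Client()
    try:
        sio.connect(URL)
    except SioConnectionError as exc:
        # Report how many connections succeeded before the first failure
        print(f"connection {k} failed after {len(clients)} successes: {exc}")
        break
    clients.append(sio)

for client in clients:
    client.disconnect()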
I need to support around 5000 connections minimum on this server. I understand that it is possible, but how do I do that please?

Related

npm run dev aborts on shared hosting server

I'm trying to set up a git repository that contains a Laravel project on a server that uses cPanel. After copying the missing libraries and dependencies from both composer.json and package.json, the project asks me to run npm run dev in order to create the mix manifest file. However, whenever I enter that command this error keeps coming up:
> # dev /home3/regioye5/repositorios/region-admin
> npm run development
> # development /home3/regioye5/repositorios/region-admin
> mix
node[474]: ../src/node_platform.cc:61:std::unique_ptr<long unsigned int> node::WorkerThreadsTaskRunner::DelayedTaskScheduler::Start(): Assertion `(0) == (uv_thread_create(t.get(), start_thread, this))' failed.
1: 0xa04200 node::Abort() [node]
2: 0xa0427e [node]
3: 0xa7429e [node]
4: 0xa74366 node::NodePlatform::NodePlatform(int, v8::TracingController*) [node]
5: 0x9d1ae6 node::InitializeOncePerProcess(int, char**) [node]
6: 0x9d1d21 node::Start(int, char**) [node]
7: 0x7fbfeb70d555 __libc_start_main [/lib64/libc.so.6]
8: 0x9694cc [node]
Aborted
I've been looking for an answer on the internet but nobody seems to have had this issue before. I read in other posts that it may be related to the process limits or something like that, so here is the ulimit -a output on the server:
core file size (blocks, -c) 0
data seg size (kbytes, -d) 800000
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 178728
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) 800000
open files (-n) 100
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 15240
cpu time (seconds, -t) unlimited
max user processes (-u) 25
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
I visited the Laravel Mix documentation and found the following command. For me it worked like a charm!
node_modules/.bin/webpack --config=node_modules/laravel-mix/setup/webpack.config.js
It appears to go straight to the library and tell it to use the config file that contains the Mix setup. Hope it works for you.

What are the .out files in the Hadoop logs folder? Is it safe to delete them?

I manage a small, fully distributed Hadoop cluster, and I was doing my routine log cleanup and inspection. I see a bunch of files with the .out extension in the {HADOOP_HOME}/logs path that I configured. There are several, such as:
hadoop-<my-system-name>-namenode-<my-system-name>.out
hadoop-<my-system-name>-namenode-<my-system-name>.out.1
hadoop-<my-system-name>-namenode-<my-system-name>.out.2
hadoop-<my-system-name>-datanode-<my-system-name>.out
hadoop-<my-system-name>-historyserver-<my-system-name>.out
hadoop-<my-system-name>-historyserver-<my-system-name>.out.2
hadoop-<my-system-name>-historyserver-<my-system-name>.out.3
hadoop-<my-system-name>-resourcemanager-<my-system-name>.out
hadoop-<my-system-name>-resourcemanager-<my-system-name>.out.1
hadoop-<my-system-name>-secondarynamenode-<my-system-name>.out
hadoop-<my-system-name>-secondarynamenode-<my-system-name>.out.1
hadoop-<my-system-name>-secondarynamenode-<my-system-name>.out.2
etc. etc. etc.
When I look at one of them with an editor, such as the hadoop-<my-system-name>-namenode-<my-system-name>.out.1 file, I get:
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 514997
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 16384
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 8092
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
What are these files? Do they serve a purpose to keep or can they be deleted?
Like all good applications, logs serve a great purpose: finding out what is happening with your service. You should probably be shipping the logs into something like Elasticsearch/Solr/Graylog/etc. to search and alert on them.
Anything that ends in a number can be safely deleted.
They are managed by the log4j.properties RollingFileAppender that is started with Hadoop.
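If you want to automate that cleanup, here is a minimal sketch (the log directory path is hypothetical; point it at your configured {HADOOP_HOME}/logs) that removes only the rotated copies ending in a number and leaves the current .out files alone:

import os
import re

LOG_DIR = "/path/to/hadoop/logs"  # hypothetical; use your configured {HADOOP_HOME}/logs

# Files named "*.out.<number>" are rotated copies and safe to remove; the plain
# "*.out" file may still be held open by the running daemon, so it is left in place.
rotated = re.compile(r"\.out\.\d+$")
for name in os.listdir(LOG_DIR):
    if rotated.search(name):
        os.remove(os.path.join(LOG_DIR, name))
        print(f"removed {name}")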

MongoDB insertion failure "error inserting documents: new file allocation failure"

I used a bash script to do the insertion:
for i in *.json
do
    mongoimport --db testdb --collection test --type json --file "$i" --jsonArray
done
Now my database testdb is 5.951GB and the terminal keeps giving me the error
error inserting documents: new file allocation failure
How much data can I hold in one collection? What is the best way for me to handle this? I currently have 20GB worth of data, but I will have another 40GB of data to add.
-UPDATE-
Here's my ulimit status:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 31681
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 4096
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 31681
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
Mongo can handle billions of documents in one collection, but the maximum document size is 16MB.
When you create a collection you can set its size:
db.createCollection( "collection-name", { capped: true, size: 100000 } )
Mongo provides a bulk API if you have to insert multiple documents into a collection: bulk-write-operations
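For reference, a minimal pymongo sketch of a bulk insert (assuming a local mongod, the same testdb/test namespace, and a placeholder data.json file holding a JSON array, as with --jsonArray; insert_many is the simplest form of the bulk write API):

import json
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # assumes a local mongod
collection = client["testdb"]["test"]

# data.json stands in for one of the *.json files from the import loop
with open("data.json") as f:
    docs = json.load(f)

# insert_many sends all documents to the server as one bulk operation
result = collection.insert_many(docs)
print(f"inserted {len(result.inserted_ids)} documents")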

hadoop ulimit open files name

I have a Hadoop cluster that we assume is performing pretty "badly". The nodes are pretty beefy: 24 cores, 60+ GB RAM, etc. We are wondering whether some basic Linux/Hadoop default configuration is preventing Hadoop from fully utilizing our hardware.
There is a post here that described a few possibilities that I think might be true.
I tried logging in to the namenode as root, as hdfs, and as myself, and looking at the output of lsof as well as the ulimit settings. Here is the output; can anyone help me understand why the settings don't match the number of open files?
For example, when I logged in as root, the lsof output looks like this:
[root@box ~]# lsof | awk '{print $3}' | sort | uniq -c | sort -nr
7256 cloudera-scm
3910 root
2173 oracle
1886 hbase
1575 hue
1180 hive
801 mapred
470 oozie
427 yarn
418 hdfs
244 oragrid
241 zookeeper
94 postfix
87 httpfs
...
But when I check out the ulimit output, it looks like this:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 806018
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
I am assuming there should be no more than 1024 files opened by one user; however, when you look at the output of lsof, there are 7000+ files opened by one user. Can anyone help explain what is going on here?
Correct me if I have made any mistake in understanding the relation between ulimit and lsof.
Many thanks!
You need to check the limits for the process itself. They may be different from those of your shell session:
Ex:
[root@ADWEB_HAPROXY3 ~]# cat /proc/$(pidof haproxy)/limits | grep open
Max open files 65536 65536 files
[root@ADWEB_HAPROXY3 ~]# ulimit -n
4096
In my case haproxy has a directive in its config file to change the maximum open files; there should be something similar for Hadoop as well.
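If you want to check this programmatically rather than by hand, a small sketch along these lines (Linux only; pass the PID of the daemon you care about) reads the limits that actually apply to a running process:

import sys

def open_file_limit(pid):
    # /proc/<pid>/limits lists the limits applied to that process, which can
    # differ from the ulimit values of your interactive shell session
    with open(f"/proc/{pid}/limits") as f:
        for line in f:
            if line.startswith("Max open files"):
                return line.strip()
    return None

if __name__ == "__main__":
    print(open_file_limit(sys.argv[1]))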
I had a very similar issue, which caused one of the cluster's YARN TimeLine servers to stop after hitting the magical 1024 open files limit and crashing with "too many open files" errors.
After some investigation it turned out that it had serious issues dealing with too many files in TimeLine's LevelDB store. For some reason YARN ignored the yarn.timeline-service.entity-group-fs-store.retain-seconds setting (by default it's set to 7 days, i.e. 604800 seconds). We had LevelDB files dating back over a month.
What seriously helped was applying a fix described in here: https://community.hortonworks.com/articles/48735/application-timeline-server-manage-the-size-of-the.html
Basically, there are a couple of options I tried:
Shrink the TTL (time to live) settings. First, enable TTL:
<property>
  <description>Enable age off of timeline store data.</description>
  <name>yarn.timeline-service.ttl-enable</name>
  <value>true</value>
</property>
Then set yarn.timeline-service.ttl-ms (set it to a low value for a period of time):
<property>
  <description>Time to live for timeline store data in milliseconds.</description>
  <name>yarn.timeline-service.ttl-ms</name>
  <value>604800000</value>
</property>
The second option, as described, is to stop the TimeLine server, delete the whole LevelDB database and restart the server. This will start the ATS database from scratch. It works fine if the other options have failed.
To do it, find the database location from yarn.timeline-service.leveldb-timeline-store.path, back it up and remove all subfolders from it. This operation will require root access to the server where TimeLine is located.
Hope it helps.

Ruby profiler stack level too deep error

It seems like I always get this error on one of my scripts:
/Users/amosng/.rvm/gems/ruby-1.9.3-p194/gems/ruby-prof-0.11.2/lib/ruby-prof/profile.rb:25: stack level too deep (SystemStackError)
Has anyone encountered this error before? What could be causing it, and what can I be doing to prevent it from happening?
I run my ruby-prof scripts using the command
ruby-prof --printer=graph --file=profile.txt scraper.rb -- "fall 2012"
Edit: I'm on Mac OS X, if that matters. Doing ulimit -s 64000 doesn't seem to help much, unfortunately. Here is what ulimit -a gives:
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
file size (blocks, -f) unlimited
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 256
pipe size (512 bytes, -p) 1
stack size (kbytes, -s) 64000
cpu time (seconds, -t) unlimited
max user processes (-u) 709
virtual memory (kbytes, -v) unlimited
Edit 2
Andrew Grimm's solution worked just fine to prevent ruby-prof from crashing, but the profiler seems to have problems of its own, because I see percentages like 679.50% of total time taken for a process...
One workaround would be to turn tail call optimization on.
The following is an example of something that works with TCO on, but doesn't work when TCO is off.
RubyVM::InstructionSequence.compile_option = {
  :tailcall_optimization => true,
  :trace_instruction => false
}

def countUpTo(current, final)
  puts current
  return nil if current == final
  countUpTo(current+1, final)
end

countUpTo(1, 10_000)
Stack level too deep usually means an infinite loop. If you look at the ruby-prof code where the error happens you will see that it's a method that detects recursion in the call stack.
Try looking at the code where you are using recursion (how many places in your code could you be using recursion?) and see if there is a condition that would cause it to never bottom out.
It could also mean that your system stack just isn't big enough to handle what you are trying to do. Maybe you are processing a large data set recursively? You can check your stack size (unixy systems):
$ ulimit -a
and increase the stack size:
$ ulimit -s 16384
You can also consider adjusting your algorithm. See this Stack Overflow question.
I hope I'm not just re-hashing an existing question...
Having percentages go over 100% in Ruby-prof has been a known bug, but should be fixed now.
