Short version of question is: How to tune\configure macOS (Mojave 10.14.3) settings to allow more then 10k outgoing TCP connections per process and more then 16k connections in total.
Details:
I'm trying to make MacBookPro (16Gb RAM, Core i7) usable for stress-testing tcp server. Server itself hosted on separate pc, so right now the question is about outgoing connections only.
Below advices already processed and helped me significantly increase initial os limits.
1) I used [launchctl] ("Too many open files" when executing gatling on Mac) to increase maxfiles limit to 1 million.
2) I used sysctl to set\check kern.maxfiles limits. Actually (as I understand) this is the same as #1.
3) I played with ulimit. Actually I didn't notice any effect of this tool on my OS. But any way...
So now I MacOS can establish ~10k connections per process and 16k total connections in the system.
For simplicity my tool just open TCP connections in a infinite loop and waits.
try
{
while (true)
{
CreateAndConnectSocket(); //add socket to list
++connectedSockets;
}
}
catch(Exception e)
{
LogWrite("Connected sockets:" + connectedSockets);
LogWrite(e);
WaitForAnyKey();
}
Then I follow below steps.
1) Launch server on separate PC.
2) Open two terminals on mac.
3) Execute in first terminal window:
$ sudo launchctl limit maxfiles 1048576 1048600
$ ulimit -S -n 1048576
4) Verify that changes applied in first terminal:
$ ulimit -S -n
1048576
$ launchctl limit maxfiles
maxfiles 1048576 1048600
$ sysctl kern.maxfilesperproc
kern.maxfilesperproc: 1048576
$ sysctl kern.maxfiles
kern.maxfiles: 1048600
5) Launch "ulimit -S -n 1048576" in second terminal (Not sure that ulimit is required at all.)
6) Verify that all changes applied in second terminal window (same as #4).
7) Launch "test client" in first terminal.
8) Launch "test client" in second terminal.
Result:
After step 7 in first terminal I can see, that tool opened 10k connections (10202 to be precise) and fell down with exception "Too many open files in system". Have no idea why opened files is an issue with 1 million limit.
After step 8 in second terminal I can see that tool opened 6k connectoins and fell down with exception "Can't assign requested address".
While sockets remain opened (tools wait for key press), no other connections can be created in the system - browsers can't establish connections to google.com, etc.
And ofcourse tcp server remains accessible from another PCs.
Since I was able to tune "Windows 10 Home" for higher connection numbers, I believe that MacOS can be tuned too.
16383 TCP connections (from the same IP to the same port) is the limit imposed by default in MacOs (at least in Mojave).
This limit is defined by the ephemeral port range:
$ sudo sysctl net.inet.ip.portrange
net.inet.ip.portrange.lowfirst: 1023
net.inet.ip.portrange.lowlast: 600
net.inet.ip.portrange.first: 49152
net.inet.ip.portrange.last: 65535
net.inet.ip.portrange.hifirst: 49152
net.inet.ip.portrange.hilast: 65535
By default the range starts from 49152 (net.inet.ip.portrange.first) and ends to 65535 (net.inet.ip.portrange.last). That is, 65535 - 49152 = 16383.
You can make the ephemeral port range starting from 32768:
sudo sysctl -w net.inet.ip.portrange.first=32768
This way you double the available ephemeral ports (65535 - 32768 = 32767).
Related
I am using a GPSD to feed GPS information to a virtual serial port. I'm generating the virtual serial port with socat, and I am listening to the virtual port using: sudo cat /dev/pts/2, where /dev/pts/2 is the drive created by socat. The GPS signal is being obtained in a C++ script . The C++ script is giving me the expected output every 1 second, but the information stream simply stops after 30 seconds.
What options can I consider in either the socat arguments or the GPSDO arguments in my C++ script, to lengthen the time past 30 seconds?
Socat in default setup has no timeout as long as both connections stay open. Apply options -d -d -d -d -lu to Socat to see in its output what happens!
I have Socat command as follow :
socat -u TCP4-LISTEN:5000,reuseaddr,fork OPEN:/tmp/test1-2039-sip-i,creat,append
And I would like to modify to listen to many ports range, from port 10000 till 29999
What is the right command to fullfill that need?
yes, socat can only listen on one port at a time per instance so my amateur method for getting around this is is using an array in a bash script to open an instance of socat for each port I need to monitor. Doing this for thousands of ports isn't really practical, since while socat uses little resources when listening, running 20,000 instances of socat is impractical, but i've run 50 at a single time on a small SoC board. So name your defined ports in an array (ports that are actually used), then iterate through a bash array loop and spawn a few instances.
#!/bin/bash
ports=( 23 24 25 443 80 )
for port in "${ports[#]}"; do
socat tcp-listen:$port,reuseaddr,fork open:/tmp/$port.txt,creat,append
done
exit
This writes data coming in from the ports to "port number file".txt in /tmp, ie. 24.txt. If you get an error, it's because something else has bonded to the port already. Use inotifywatch to alert you when a file gets written to.
This question has been out here a while and it keeps coming up in my searches. So I decided to answer it.
I have both TCPView and Tcpvcon on my Windows 10 machine and I wonder how to get all the information (port numbers, etc.) displayed in TCPView in the output of the Tcpvcon program? TCPView has the process name, PID, protocol, remote address, remote port, etc. in its output to the GUI. Tcpvcon, on the other hand, only contains the process name, protocol, remote and local address. I would like to have all information that can be read in the TCPView GUI in the command line output of Tcpvcon (especially the port numbers). Tcpvcon seems to have only the three switches -a -c -n but no matter how I combine them, I do not reach my goal. Can anyone help me?
Below is a sample output when I use all three switches. In TCPView I see much more information about the specified process.
I was also very surprised that tcpvcon does not show port numbers (maybe we should ask Mark R. to add them ;-)
BUT you could use
netstat -a -o -n
or with an admin shell even
netstat -a -o -n -b
switches meaning:
-a ... Displays all active TCP connections and the TCP and UDP ports
on which the computer is listening.
-o ... Displays active TCP connections and includes the process ID (PID)
for each connection.
-n ... Displays active TCP connections, however, addresses and port numbers
are expressed numerically and no attempt is made to determine names.
-b ... Displays the executable involved in creating each connection or
listening port. (Note that this option can be time-consuming and
will fail unless you have sufficient permissions.)
To get all available switches just use netstat -? (there are other interesting ones) or https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/netstat
swobi
Up until the 2011 release of TCPVCON, it used to show port info.
The newer versions don't any more.
If you could get your hand on version 2.54, you would be able to get port info.
Tested with tcpvcon-v2.34 (I couldn't find 2.54) and it shows the ports but it doesn't show the process, all conections appear as from System. Also TCPV6 and UDPV6 are missing.
This is an example:
C:\WINDOWS\system32>"C:\My Program Files\TCPView-v4.13\tcpvcon-v2.34.exe" -a -c
TCP,System,-1,LISTENING,WXP-OR7507156:epmap,WXP-OR7507156:0
TCP,System,-1,LISTENING,WXP-OR7507156:microsoft-ds,WXP-OR7507156:0
TCP,System,-1,LISTENING,WXP-OR7507156:sms-rcinfo,WXP-OR7507156:0
TCP,System,-1,LISTENING,WXP-OR7507156:5040,WXP-OR7507156:0
TCP,System,-1,LISTENING,WXP-OR7507156:wsd,WXP-OR7507156:0
..
UDP,System,-1,,192.168.56.1:137,*:*
UDP,System,-1,,192.168.56.1:138,*:*
UDP,System,-1,,192.168.56.1:2177,*:*
UDP,System,-1,,192.168.56.1:5353,*:*
EDIT:
I correct myself. ASB was right.
I just got TCPView v2.54 and it does indeed show the application, the ports and also TCPV6 and UDPV6.
So I confirm that the "good" version is v2.54.
Tcpvcon.exe -a -c
TCPView v2.54 - TCP/UDP endpoint viewer
Copyright (C) 1998-2009 Mark Russinovich
Sysinternals - www.sysinternals.com
TCP,dnscrypt-proxy.exe,4188,LISTENING,WXP-XXX:domain,WXP-XXX:0
TCP,[System Process],0,TIME_WAIT,WXP-XXX:domain,localhost:62240
..
UDP,Teams.exe,12632,*,WXP-XXX:58950,*:*
TCPV6,svchost.exe,1232,LISTENING,wxp-XXX:135,wxp-XXX:0
..
UDPV6,svchost.exe,19712,*,wxp-XXX:50836,*:*
UDPV6,System,4,*,wxp-XXX:56736,*:*
To display the port numbers (and the process names) you need the old v2.54 version of tcpvcon.exe
This SysinternalsSuite.zip Archive from the Wayback Machine contains this version:
https://web.archive.org/web/20100201154325/http://download.sysinternals.com/Files/SysinternalsSuite.zip
all
The cluster system is constructed using the Perceus program. (scientific linuxs 6.9)
I installed condor in the vnfs file.
After this, when I make an ssh connection, I get a problem that ssh connection is disconnected after 10 minutes. The command is not recognized as shown below.
ssh was not disconnected before installing condor. However, we confirmed that pinging is done without loss.
how could fix this problem? Please suggest a solution
enter image description here
Can you go to this directory to edit ssh file
cd /etc/ssh/sshd_config
change
ClientAliveInterval 120
ClientAliveCountMax 720
This will make the server send the clients a “null packet” every 120 seconds and not disconnect them until the client have been inactive for 720 intervals (120 seconds * 720 = 86400 seconds = 24 hours).
For the past few years, I have been using an rsync one-liner to back up important folders on my Mac Mini desktop (OSX 10.9, 2.5 GHz i5, 4 GB RAM) to a FreeNAS box (0.7.2 Sabanda revision 5266, Pentium D 2.66 GHz, 822MiB RAM [reported by the system, I think there's 1 GB in there]). I am running an rsync daemon on the FreeNAS box. Recently, these transfers have been hanging indefinitely. I have done the usual Google-fu and am unable to identify the source of the problem or a solution.
The one-liner is:
rsync -rvOlt --exclude '.DS_Store' \
--exclude '.com.apple.timemachine.supported' \
--delete /Volumes/Storage/Music/Albums/ 192.168.1.100::albums
I have tried enabling -vvv and --progress, but there is no pattern that I can discern between what hangs and what doesn't. Heck, if I retry, the same file might hang at a different point during the transfer or not at all. A dry run (-n) does not always succeed either. The only "success" I've had is implementing a timeout (--timeout=10) and rerunning the command over and over. Eventually, I creep along, but with no guarantee of success and at a pace that is unacceptable. I've reached a point where I have one file that I can't get past.
The Mac Mini is connected to my router via 5 GHz. The FreeNAS box is wired into that same router on a 100 mbit port. When transfers are actually going, rsync --progress reports 2.5-4 MB/s. According to --progress, a hang is literally just that—no data transfer is occurring as far as I can tell.
I need help with both the diagnostics and the solution.
I was having the same problem. Removing -v didn't work for me. My use-case is slightly different in that I'm going from source (EXT4) to ExFAT. The issue for me was that rsync was attempting to preserve device files and permissions, which ExFAT doesn't support. I was using the -hrltDvaP switches. The -D and -a switches seemed to be my problem. The -a switch translates to -rlptgoD (no -H,-A,-X). The -p, -g, and -o switches seemed to be my root cause as rsync was barfing on one or all of those during runtime. Removing -a and specifying -Prltvc switches explicitly is working for me.
bkupcmd="nice -n$nicelevel /usr/bin/rsync -Prltvc --exclude-from=/var/tmp/ignorelist "
I've been running into the same thing again and again and it seems to help if you drop the -v option (which is annoying if you need that output).
Try using --whole-file/-W.
This command disables the rsync delta-transfer algorithm.
That is what worked for us (WSL to OSX)
our full sync flags were -avWPle
(e was because we were using ssh, and that has to be the last flag)
This happened to me when the remote device ran out of space. The error wouldn't show when --verbose option was used; turning that off yielded some STDERR output that explained that the remote device was out of space. When I freed some space, I was able to run rsync again with --verbose and everything went fine.
I am using openSUSE 13.2 Linux, rsync version 3.1.1-2.4.1.x86_64, and I experienced similar problems, doing an rsync between my laptop and an external hard disk, with the destination device definitively having enough free space.
I thought I got an improvement omitting option -v, but after 10 minutes it was hanging again: strace said:
select(5, [], [4], [], {60, 0}) = 0 (Timeout)
And with "iotop" I counld see confirm that the rsync processes did no significant disk IO any more.
Neither removing the -v option nor limiting the bandwidth using --bwlimit fixed the problem.
Just had a similar problem while doing rsync from harddisk to a FAT32 USB drive. rsync froze already in less than a second in my case and did not react at all after that ... left it with CTRL+C.
Found out that the problem was a combination of usage of hardlinks on the harddisk and having FAT32 filesystem on the USB drive, which does not support hardlinks.
Formatting the USB drive with ext4 solved the problem for me.
In my situation rsync was not actually failing.
I have regular server backups which transfers large files over 500GB+ and have --append-verify or --checkusm over ssh parameters specified.
What I have found upon analysis is that once the client side completes it's file checks then the server side checks start. Which means while the server is doing it's checks the client side will appear hanged and frozen - run htop on the server to rsync working away.
This is likely a non issue if rsync is run in deamon mode on the server and using the rsync protocol instead of ssh for transfers.
On related note, this very LONG wait would trigger SSH timeout and a rsync: connection unexpectedly closed (254 bytes received so far) [sender] error message, sollution is to add ClientAliveInterval 120 and ClientAliveCountMax 720 to /etc/ssh/sshd_config.
I've seen this quite often on 3.0.9 on a directory with hardlinks, but it also happened on 3.1.3.
There is a nice analysis in Debian bug 820916: when its internal sockets are congested with errors, rsync could go into a deadlock.
This might have been fixed in a 3.2 release just a few days ago (Jun 2020):
Avoid a hang when an overabundance of messages clogs up all the I/O buffers.
The only good workaround I can think of is, if the problem is not persistent, then put timeout in front of it: timeout rsync <args> <source> <destination>, then retry. If it is persistent for you, you're the lucky one who can debug it :D
It also happens when the user on target machine has not write permissions on target folder.
You can try giving write permission to others target folder:
sudo chmod -R o+w /path/to/target-folder
In my case, it was the IPC (Intrusion Protection Component) in our firewall. It sees all the TCP SYN packets as a flood attack and kills the connection. I left a rsync over NFS session open and turned off the IPC for the servers firewall rule and it starting working again right away.
rsync -ravh /source /destination
When it happened I was not able to kill the rsync session. It locked up the NFS mount and I would have to reboot the client machine to get it to work again. The strange thing is it would copy some files over then all of a sudden stop. It always seemed to stop on the same file. So I was looking for file issues, permission issues, TCP offloading issues, tried removing the -v in the rsync call. If you are having this issue at least in my case it even happened with a simple.
cp -rp /source /destination
So I knew then to start looking at other factors. So if you have any sort of intrusion protection on a firewall or router between the servers you can try turning that off temporarily to see if it solves your issue as well.
Most likely not "your" problem, but I stumbled upon this question when I was researching a similar behavior:
I'm observing "hanging" when the target site has too much io load. e.G. on one of my small business servers, when someone is resyncing his IMAP account and downloading large batchs of data and a backup job runs that writes his data.
In this situation I notice a steep drop in performance for rsync. Noticeable in a high load value in top on the target machine, even though CPU and Mem are fine.
Waiting for the process to finish has helped every time or interrupting and attempting the rsync at a later time again.
I was having the same problem and it was because I was running out of memory during the rsync. Created a swap file and problem solved.
Had rsync hanging issue on Ubuntu 16. None of the options above helped. The problem was in the source drive (external SSD) which suddenly became faulty. I tried several disk checks, but all of them stuck. Ended up rebooting the system and disk suddenly became accessible again.
Holger Ohmacht aka h8ohmh / 8ohmh:
The problem lies in the filesystem buffer / usage of the interworking of harddisk/hw so far as I could investigate.
Temporal solution for local drives (eg. USB3<->HD) : A script which is polling the changing disk space. If no changing free disk space then rsync is stalled and has to be restarted
cmd="rsync -aW --progress --stats --preallocate --super \
<here your source dir> \
<here your dest dir>"
eval "$cmd" &
rm ./ndf.txt
rm ./odf.txt
while [[ 0 == 0 ]]; do
df > ./ndf.txt
cmp ./odf.txt ./ndf.txt
res="$?"
echo "$res"
if [[ $res == 0 ]]; then
echo "###########################################"
ls -al "./ndf.txt"
ls -al "./odf.txt"
killall rsync
eval "$cmd" &
else
cp ./ndf.txt ./odf.txt
fi
sleep 60
done
Change <source dir> etc to your paths!
In my case it is always stalling by usage of rsync's --preallocate option (normally because of better disk performance and rescueing continuous blocks), so as long as the disk and filesystem drivers not reworked there just this solution