I need the connection-failed timeout on Solaris to be as short as the one I get on Windows, but on Solaris it is much longer. How can I shorten it? (I'm deliberately connecting to a machine that does not exist, to simulate a machine being down.)
When I run this on Windows, the timeout is short, which is good:
D:>telnet 192.168.23.21 222
Connecting To 192.168.23.21...Could not open connection to the host, on port 23:
Connect failed
D:>
On Windows (the target IP does not exist), the command terminates after about 15 seconds.
However, when I run the same thing from a Solaris machine, the timeout is very long, which is not good for my legacy code:
myuser#mycomp:~$ telnet 192.168.23.21 222
Trying 192.168.23.21...
Then the process does not terminate for a very long time.
This has major implications for me because I'm migrating an app from Windows to Solaris, and the timeout lives in legacy code that I cannot update. So I need to control this timeout at the OS level, to make it as short as it currently is on Windows. How can I change this timeout on my Solaris machine?
Thanks
If you absolutely have to do this systemwide, there's a TCP driver parameter tcp_ip_abort_cinterval that can be modified:
tcp_ip_abort_cinterval - This is the amount of time that a connection is allowed to stay in a half-open state. The default is 180,000 ms (3 minutes). You can change this to 25,000 (25 seconds) if you want. Please note that by changing this you may find that SLIP/PPP users have problems contacting your site.
To view your current setting:
/usr/sbin/ndd /dev/tcp tcp_ip_abort_cinterval
To change the setting:
/usr/sbin/ndd -set /dev/tcp tcp_ip_abort_cinterval 25000
Perhaps you could set the socket option SO_SNDTIMEO -- that link reports Solaris doesn't respect that option, but you might be lucky and they've fixed it by now. :)
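For reference, here is roughly what setting it looks like (a hedged sketch; the helper name is mine, and as noted above Solaris may simply ignore the option for connect(2)):
#include <sys/socket.h>
#include <sys/time.h>

/* Sketch: request a send timeout on an already-created socket. On platforms
   that honour SO_SNDTIMEO for connect(2), this also bounds the connect attempt. */
static int set_send_timeout(int fd, int seconds)
{
    struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };
    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv);
}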
If the socket option doesn't work, you could always set an alarm(2) for some point in the future and interrupt your connect(2) call. It feels pretty gross, but it is an option.
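A rough sketch of the alarm(2) approach (my own illustration, not taken from any particular codebase): install a SIGALRM handler without SA_RESTART so the blocking connect(2) is interrupted and returns EINTR, which we then report as a timeout.
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>

static void on_alarm(int sig) { (void)sig; /* exists only to interrupt connect() */ }

static int connect_with_alarm(int fd, const struct sockaddr *addr,
                              socklen_t len, unsigned seconds)
{
    struct sigaction act;
    memset(&act, 0, sizeof act);
    act.sa_handler = on_alarm;            /* deliberately no SA_RESTART */
    sigemptyset(&act.sa_mask);
    sigaction(SIGALRM, &act, NULL);

    alarm(seconds);                       /* arm the timeout */
    int rc = connect(fd, addr, len);
    int saved_errno = errno;
    alarm(0);                             /* cancel any pending alarm */

    if (rc < 0 && saved_errno == EINTR)
        errno = ETIMEDOUT;                /* report the interruption as a timeout */
    else
        errno = saved_errno;
    return rc;
}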
Another option is to switch to non-blocking socket operations and check at some point in the future whether the connect(2) operation succeeded or not. You can set a timeout on select(2) and discover whether the socket has errored or become readable/writable, as in the sketch below. (See also the EINPROGRESS bit in connect(2).)
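A sketch of that non-blocking variant (again, the helper name is mine): set O_NONBLOCK, expect connect(2) to return EINPROGRESS, wait for writability with a select(2) timeout, then read SO_ERROR to learn whether the connection actually succeeded.
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>

static int connect_with_select(int fd, const struct sockaddr *addr,
                               socklen_t len, int seconds)
{
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);

    int rc = connect(fd, addr, len);
    if (rc < 0 && errno != EINPROGRESS)
        return -1;                         /* immediate, real failure */

    if (rc != 0) {                         /* connection attempt still in progress */
        fd_set wfds;
        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        struct timeval tv = { .tv_sec = seconds, .tv_usec = 0 };

        rc = select(fd + 1, NULL, &wfds, NULL, &tv);
        if (rc == 0) { errno = ETIMEDOUT; return -1; }   /* nothing happened in time */
        if (rc < 0)  return -1;

        int err = 0;                       /* writable: did the connect succeed? */
        socklen_t elen = sizeof err;
        getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &elen);
        if (err != 0) { errno = err; return -1; }
    }

    fcntl(fd, F_SETFL, flags);             /* restore the original blocking mode */
    return 0;
}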
We have recently observed rare UDP communication issues with the following symptoms:
A socket sendto() call fails with error WSAENOBUFS (10055)
A subsequent recvfrom() call on this socket does not receive anything, even though Wireshark shows that the network interface actually received the expected datagrams. This situation persists for approximately 8 seconds, afterwards new incoming datagrams can be received again from the socket.
In the Windows System Log, a Kernel-General information entry appears at the time of the sendto() error:
The access history in hive \??\C:\ProgramData\Microsoft\Provisioning\Microsoft-Desktop-Provisioning-Sequence.dat was cleared updating 0 keys and creating 0 modified pages.
The issue happens on a customer system running Microsoft Windows 10 Pro for Workstations, Version 10.0.17763 Build 17763.
On that system we were able to reproduce the issue with a simple test program written in C++ that echoes UDP datagrams. We verified that the thread receiving from the socket was actually responsive all the time, by specifying a timeout of 1 second using SO_RCVTIMEO, printing some “still alive” output and immediately calling recvfrom() again.
On our own test system, we were unable to observe the issue under the same circumstances as the customer. However, we were able to provoke similar effects when playing around with the network adapter settings while the test was running. Enabling the Microsoft LLDP Protocol Driver produced the sendto() error and sometimes also resulted in the 8-second "silence" period, but without any Windows System Log entry.
Any hints are greatly appreciated.
The issue seems to be related to the Microsoft Provisioning tool present since Windows 10 1809.
Disabling it fixed the issue in our case:
Open Task Scheduler, go to Microsoft/Windows/Management/Provisioning and disable the Logon task.
Source: Windows TenForums
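If you prefer the command line, the same task can likely be disabled with schtasks from an elevated prompt (the task path below is inferred from the Task Scheduler location above and may differ between Windows builds):
schtasks /Change /TN "Microsoft\Windows\Management\Provisioning\Logon" /Disable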
If I start an SSH connection from my Windows 10 laptop, it gets aborted within a minute, even when I'm actively using the connection.
I tried multiple servers (Ubuntu 18, 16 and ESXi 6.7), all with the same problem, and also tried different clients (PuTTY and MobaXterm).
I did a packet capture, and it looks like the server sends a RST with ACK to my laptop, after which my laptop responds with FIN and ACK.
If I set up the same connection from my phone with JuiceSSH, it keeps working normally. That's why I suspect my laptop, but I have no idea how to resolve it.
Any ideas?
In PuTTY, feel free to try the following: in your session properties, go to Connection and, under "Sending of null packets to keep session active", set "Seconds between keepalives (0 to turn off)" to e.g. 300 (5 minutes).
source: https://patrickmn.com/aside/how-to-keep-alive-ssh-sessions/
In the timing section of the Firefox Network Monitor documentation, "Blocked" is explained as:
Time spent in a queue waiting for a network connection.
The browser imposes a limit on the number of simultaneous connections that can be made to a single server. In Firefox this defaults to 6
Is the limit on the number of connections the only limitation? Or does time the browser spends waiting to get a connection from the OS count as blocked too?
In a fresh browser, on a first connection, before any other connection is made (so the limit should not apply here), I get blocked for 195 ms.
Is this the browser waiting for the OS? What does "Blocked" mean here?
We changed the Firefox setting (about:config) 'network.http.max-persistent-connections-per-server' to 64 and the blocks went away. We then changed it back to 6 and switched our design/development approach to a more 'asynchronous' loading method, so as not to open a large number of simultaneous connections. The blocks were mostly from loading a lot of PNG flags for locale settings.
I have a server that takes several seconds to respond, which allowed me to cross-reference the firefox measurement with a wireshark trace. I see that the first SYN is sent out immediately. The end of the "Blocked" time corresponds to when the Server Hello comes back.
I couldn't relate the end of "TLS setup" to any Wireshark packet. It extends a few seconds beyond the last data that is exchanged on the initial TLS connection.
Bottom line: it doesn't look like the time spent in "Blocked" and "TLS setup" is very reliable, at least in some cases.
My setup has a TLS reverse proxy that forwards the connection with SNI. I'm not sure if that might be related.
Time spent in a queue waiting for a network connection.
The browser imposes a limit on the number of simultaneous connections
that can be made to a single server. In Firefox this defaults to 6,
but can be changed using the
network.http.max-persistent-connections-per-server preference. If all
connections are in use, the browser can't download more resources
until a connection is released.
Source: https://developer.mozilla.org/en-US/docs/Tools/Network_Monitor
It's very clear that the browser caps the limit at 6 concurrent connections per server (domain/IP), so the OS question is not very relevant.
In my case both the waiting-for-network-connection and DNS lookup times were pretty high, up to 2 seconds each, which caused significant page load times when a page was loaded for the first time. Firefox was freshly installed without add-ons and started with no other open tabs. I tried both Ubuntu 18.04 LTS and Ubuntu 19.04, with the same results. Although my ISP doesn't provide IPv6 support, my router assigns IPv6 addresses. As it turned out, the problem was the broken IPv6 network, which forced Firefox to fall back to IPv4 (of course only after a timeout). After I turned off IPv6 support in Linux, the requests sped up significantly.
Here is a relevant discussion: https://bugzilla.mozilla.org/show_bug.cgi?id=1452028
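For reference, one quick (non-persistent) way to switch IPv6 off at runtime on Linux is via sysctl; this is a generic sketch, not necessarily the exact steps used above:
sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1
sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1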
I encountered this error while using an Angular 9 'dist' deployment. I discovered that the error appeared because I was trying to access an API that was unreachable at the specified IP address and port.
Therefore, to solve it, I just had to point at a valid and accessible API.
What does it mean when the terminal throws this error, and how do I solve it?
packet_write_wait: Connection to xxx.xxx.xxx.xxx: Broken pipe
It just happened today, after working normally for years.
My terminal keeps disconnecting after a certain time. I have already searched on Google, but most of the results are about "Write failed: Broken pipe.",
which I solved years ago. I just ran into this new, annoying problem today.
I experienced this problem as well and spent a few days trying to bisect it.
As others have noted, playing with SSH keepalive parameters (ClientAliveInterval, ClientAliveCountMax, ServerAliveInterval and ServerAliveCountMax) or TCP keepalive settings (TCPKeepAlive on/off) does not solve the problem.
After playing with USB-to-Ethernet drivers and tcpdump, I realized the issue was due to the kernel 4.8 I was using. I switched the source (sending side) to 4.4 LTS and the problem disappeared (rsync via ssh and scp were working nicely again). The destination side can remain on 4.8 if you want; in my use case this worked (tested).
On the technical side, we can narrow the issue down a bit thanks to the Wireshark dump I made: the TCP channel of the SSHv2 session is being reset (the TCP RST flag set to 1), causing the connection to abort. I don't know the cause of that RST yet; I need to bisect from 4.8.1 to 4.8.11 to find it.
I'm not saying your problem is specifically due to kernel 4.8, but given the date you posted your question, chances are high that you are currently running a kernel more recent than 4.4.
If that is an ssh connection, then you might want to make sure you send a keepalive message to the server.
ServerAliveInterval seems to be the most common strategy to keep a connection alive. To prevent the broken pipe problem, here is the ssh config I use in my ~/.ssh/config file (the system-wide client equivalent is /etc/ssh/ssh_config; sshd_config is the server-side configuration):
Host myhostshortcut
HostName myhost.com
User barthelemy
ServerAliveInterval 60
ServerAliveCountMax 10
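The same options can also be passed for a one-off connection on the command line:
ssh -o ServerAliveInterval=60 -o ServerAliveCountMax=10 barthelemy@myhost.com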
Connect through another Wi-Fi network.
I don't know why or how it works, but it does.
The original poster sthapaun already mentioned this solution in a comment, but I want to add that the solution works for me, too.
I need a way to decrease the timeout my X server applies to remote applications. Currently X11 will keep an application on the display for a very long time (> 30 min) after the Ethernet connection is removed. I need it to time out within 10-30 seconds of losing communication with the application.
I am running a standard Xorg server with no modification made to it. I have tried numerous methods for doing this. I have tried using the -to option on the X server but this does not seem to have any effect. I have also tried messing with the TCP properties using sysctl. I have set the tcp_keepalive_* properties to values which should give me the timeout needed but this also does not seem to have an effect on the timeout.
Also, the remote applications are not using SSH tunneling to connect to the server. It is an open server on a secure connection, so tunneling is not needed. The timeout mechanism must be implemented on the server side, as I have no control over the applications.
Anyone have any ideas how to get the needed behavior from the X server?
The X server doesn't have client timeouts. Anything you see that looks like one is TCP's doing, not X's.
If you're lucky, the application you're talking to responds to the _NET_WM_PING protocol (most modern toolkits do this for you internally). If you can at least control the window manager you're using, you could modify it to send ping messages to all your running apps and blow them away with XKillClient if they don't respond promptly.
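To make that concrete, here is a rough Xlib sketch (my own illustration, not a patch for any particular window manager) of pinging one client window and killing its client if no reply arrives in time. It assumes you already know the target window id and that selecting SubstructureNotify on the root window is acceptable, since ping replies are sent to the root window:
#include <X11/Xlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Send a _NET_WM_PING to `target`; if no reply shows up within timeout_secs,
   assume the client is hung and remove it with XKillClient. */
int ping_or_kill(Display *dpy, Window target, int timeout_secs)
{
    Window root = DefaultRootWindow(dpy);
    Atom wm_protocols = XInternAtom(dpy, "WM_PROTOCOLS", False);
    Atom net_wm_ping  = XInternAtom(dpy, "_NET_WM_PING", False);

    /* Ping replies are ClientMessages delivered to the root window. */
    XSelectInput(dpy, root, SubstructureNotifyMask);

    XEvent ev;
    memset(&ev, 0, sizeof ev);
    ev.xclient.type = ClientMessage;
    ev.xclient.window = target;
    ev.xclient.message_type = wm_protocols;
    ev.xclient.format = 32;
    ev.xclient.data.l[0] = net_wm_ping;
    ev.xclient.data.l[1] = CurrentTime;   /* a real WM would use a proper server timestamp */
    ev.xclient.data.l[2] = target;
    XSendEvent(dpy, target, False, NoEventMask, &ev);
    XFlush(dpy);

    for (int i = 0; i < timeout_secs * 10; i++) {
        XEvent reply;
        while (XCheckTypedWindowEvent(dpy, root, ClientMessage, &reply)) {
            if (reply.xclient.message_type == wm_protocols &&
                (Atom)reply.xclient.data.l[0] == net_wm_ping &&
                (Window)reply.xclient.data.l[2] == target)
                return 0;                 /* client answered: it is alive */
        }
        usleep(100000);                   /* poll every 100 ms */
    }

    fprintf(stderr, "no ping reply, killing client of window 0x%lx\n", target);
    XKillClient(dpy, target);
    XFlush(dpy);
    return -1;
}
Note that this only helps for clients that implement _NET_WM_PING; a client that never advertised the protocol should not be killed on this basis alone.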