Just cross-posting this from github.
I am using xorm 0.4.3 with go-mysql. We are on Golang 1.4.
We have specified maxIdleConnetions and maxOpenConnections in xorm as below:-
var orm *xorm.Engine
...
orm.SetMaxOpenConns(50)
orm.SetMaxIdleConns(5)
And we are using the same single xorm instance to query Mysql.
But still we are seeing lot of connections in TCP Connection Establised state which is way over the numbers I have configured in maxIdleConnetions and maxOpenConnections state when we lsof:-
app 8747 10568 sandeshsharma 16u IPv4 691032 0t0 TCP 127.0.0.1:57337->127.0.0.1:mysql (ESTABLISHED)
We have also observed that even if we stop the MySQL, the connection numbers still remain fixed but in the CLOSED_WAIT state. If we shutdown the app then all connections go away.
app 8747 10844 sandeshsharma 38u IPv4 505058 0t0 TCP 127.0.0.1:54160->127.0.0.1:mysql (CLOSE_WAIT)
However in mysql process list it is showing the correct number of connections as I have specified in maxIdleConnetions and maxOpenConnections.
Can some one please explain me this behaviour? Why are we observing so much TCP connections even though we have specified maxIdleConnetions and maxOpenConnections to 5 & 50 respectively?
First of all, Go 1.4 is too old. Use latest Go 1.6.
This answer is written with knowledge of Go 1.6. So some details may be different in your case.
There are four state in connection: connecting, idle, inuse, and closing.
MaxOpenConnections limits number of connection in connecting, idle, inuse state.
So, if your application closes and reopen connection quickly, it can happen.
Since TCP is CLOSED_WAIT state in MySQL server side, your app is waiting EOF from connection. I suppose your app is under very high load
and slow at reading EOF and closing connection. Until read EOF and close connection, TCP state is ESTABLISHED on client side, regardless TCP state in server side.
I recommend you to update Go and "go-sql-driver/mysql", and set MaxIdleConns equal to MaxOpenConns to avoid high reconnect rate.
Instead, you can use SetConnMaxLifetime (new API in Go 1.6) to close
connections when your app is idle.
Related
Why does Apache-Commons-Net's FTPClient sometimes make the wrong computation for the port number in the PORT command? This is in active mode. For example FTPClient it could send out
PORT <some>,<ip>,<address>,<here>,235,181 when in fact the port number used is 60340. What's the cause for this wrong computation?
This could happen on version 3.3.
I know ftpClient.enterLocalPassiveMode(); could solve this, but I want to know the part where the active mode doesn't work as expected.
From your comments, I assume you mistake an FTP control connection with a data connection.
I assume that the 60340 is local port of the FTP control connection. When opening data connection, 60341 is assigned (hence the PORT ...,235,181).
Reasoning: In an FTP active mode, the client opens listening port for the expected data connection, which it then reports to the server via PORT command over an existing control connection. If the server cannot connect to the port, no TCP/IP packet can ever come to that port. As you claim that the "two machines still communicate at port 60340", it must be the control connection. There cannot be any communication on port, if the connection failed ("Can't open data connection").
The actual cause of the "Can't open data connection" error is likely that you are behind a firewall, so the server cannot connect back to the client. What is a common nowadays. That's what passive mode is good for.
Ive searched internet but didnt got the answer can any1 explain me the difference between them
A TCP "connection" is a 4-tuple. Local IP, Local Port, Remote IP, and Remote Port. Each end maintains this identification within its TCP stack, with the senses reversed (Local vs. Remote).
The combination of these 4 values must be unique. This explains the problems people often have writing a TCP client that reuses a socket to reconnect to the same server.
A "closed" connection leaves this ID in the tables at each end for some time, in TIME_WAIT state. This is an artifact of a TCP mechansim that deals with maintaining connection integrity even if the physical layer connection breaks, keeps pending packets from being recevied by a second connection, etc. TIME_WAIT can last up to 4 minutes.
Unless the client resets its socket's LocalPort to 0 (which is a request for an automatic ephemeral port assignment) it can fail if it tries to reconnect before TIME_WAIT expires. Since this is 0 for a newly created socket, programmers often overlook this requirement prior to calling Connect.
LocalPort isn't just an issue for listening sockets.
A server listens on a localport, while a client sends data from the localport.
The client remoteport should be the same as the server localport.
i.e.:
Server listens on port n (local port relative to server)
Client connects to server on port n (remote port relative to client)
To answer your question, the difference is in name, based on perspective.
This seems to be a good place to start with VB6 socket communication
I had some networking issues on my Windows server, and find out (by using NETSTAT) that I have more than 90,000 (!) connections in TIME_WAIT which didn't closed.
I've changed the TcpTimedWaitDelay param in registry, but apparently a server restart is needed.
Because it's a single production DB server, I can't afford it at the moment.
Is there any way killing a TIME_WAIT connection?
Any other suggestions?
Thanks!
Roei
I have more than 90,000 (!) connections in TIME_WAIT which didn't closed.
No you don't. These represent connections which have already closed, and whose local port is hanging around for TCP security reasons. They will only be that way for a couple of minutes each. Just wait.
When the Oracle 10 databases are up and running fine, OCILogon2() will connect immediately. When the databases are turned off or inaccessible due to network issues - it will fail immediately.
However when our DBAs go into emergency maintenance and block incomming connections, it can take 5 to 10 minutes to timeout.
This is problematic for me since I've found that OCILogin2 isn't thread safe and we can only use it serially - and I connect to quite a few Oracle DBs. 3 blocked servers X 5-10 minutes = 15 to 30 minutes of lockup time
Does anyone know how to set the OCILogon2 connection timeout?
Thanks.
I'm currenty playing with OCI and it seems to me that it's impossible.
The only way I can think of is to use non-blocking mode. You'll need OCIServerAttach() and OCISessionBegin() instead of OCILogon() in this case. But when I tried this, OCISessionBegin() constantly returns OCI_ERROR with the following error code:
ORA-03123 operation would block
Cause: The attempted operation cannot complete now.
Action: Retry the operation later.
It looks strange and I don't yet know how to deal with it.
Possible workaround is to run your logon in another process, which you can kill after timeout...
We think we found the right file setting - but it's one of those problems where we have to wait until something rare and horrible occurs before we can verify it :-/
[sqlnet.ora]
SQLNET.OUTBOUND_CONNECT_TIMEOUT=60
From the Oracle docs..
http://download.oracle.com/docs/cd/B28359_01/network.111/b28317/sqlnet.htm#BIIFGFHI
5.2.35 SQLNET.OUTBOUND_ CONNECT _TIMEOUT
Purpose
Use the SQLNET.OUTBOUND_ CONNECT _TIMEOUT parameter to specify the time, in seconds, for a client to establish an Oracle Net connection to the database instance.
If an Oracle Net connection is not established in the time specified, the connect attempt is terminated. The client receives an ORA-12170: TNS:Connect timeout occurred error.
The outbound connect timeout interval is a superset of the TCP connect timeout interval, which specifies a limit on the time taken to establish a TCP connection. Additionally, the outbound connect timeout interval includes the time taken to be connected to an Oracle instance providing the requested service.
Without this parameter, a client connection request to the database server may block for the default TCP connect timeout duration (approximately 8 minutes on Linux) when the database server host system is unreachable.
The outbound connect timeout interval is only applicable for TCP, TCP with SSL, and IPC transport connections.
Default
None
Example
SQLNET.OUTBOUND_ CONNECT _TIMEOUT=10
Why is it that TCP connections to a loopback interface end up in TIME_WAIT (socket closed with SO_DONTLINGER set), but identical connections to a different host do not end up in TIME_WAIT (they are reset/destroyed immediately)?
Here are scenarios to illustrate:
(A) Two applications, a client and a server, are both running on the same Windows machine. The client connects to the server via the server's loopback interface (127.0.0.1, port xxxx), sends data, receives data, and closes the socket (SO_DONTLINGER is set).
Let's say that the connections are very short-lived, so the client app is establishing and destroying a large number of connections each second. The end result is that the sockets end up in TIME_WAIT, and the client eventually exhausts its max number of sockets (on Windows, this is ~3900 by default, and we are assuming that this value will not be changed in the registry).
(B) Same two applications as scenario (A), but the server is on a different host (the client is still running on Windows). The connections are identical in every way, except that they are not destined for 127.0.0.1, but some other IP instead. Here the connections on the client machine do NOT go into TIME_WAIT, and the client app can continue to make connections indefinitely.
Why the discrepancy?
The TIME_WAIT state only occurs at one end of the connection -- the end that closes first. For the loopback interface both ends are on the same machine so you will always see TIME_WAIT.
In your other case, try looking at the other machine. I think you'll see the TIME_WAIT sockets there.