Open multiple keep-alives to a server in Go

I have an application that opens a keep-alive connection to a few remote servers (that I control). It sends a heartbeat packet to keep each connection alive before the timeout.
This is how I created my transport:
// Keep-alive connection to the servers
tr := &http.Transport{}
client := &http.Client{Transport: tr}
If I use &http.Transport{MaxIdleConnsPerHost: 2} and set MaxIdleConnsPerHost to a value greater than 2, then I'm able to maintain multiple keep-alive connections per remote server. However, these additional keep-alives are created by Go itself when concurrent requests have to be made, and they are terminated automatically after the timeout expires.
My question is: how can I create the additional keep-alives myself, say 5 per remote server, when I initialize my transport (when I start Go), and keep them all alive? This would speed up subsequent requests greatly, and speed is very important.

Based on inputs from the go-nuts group: to manually open multiple keep-alives to one server, make that many simultaneous requests. Go then keeps those connections alive until the remote server times them out (the default is 5 seconds in Apache).
Please note that the number of idle connections kept cannot exceed MaxIdleConnsPerHost, which is 2 by default.
You can verify this behaviour using netstat -p tcp.
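For illustration, here is a minimal sketch of that warm-up approach; the server URLs, the /ping path and the count of 5 connections per host are assumptions for the example:

package main

import (
    "io"
    "net/http"
    "sync"
)

func main() {
    const perHost = 5 // desired keep-alives per remote server (assumption)
    tr := &http.Transport{
        MaxIdleConnsPerHost: perHost, // must be >= the number of keep-alives you want kept
    }
    client := &http.Client{Transport: tr}

    // Hypothetical servers under your control.
    servers := []string{"http://server1.example.com/ping", "http://server2.example.com/ping"}

    var wg sync.WaitGroup
    for _, u := range servers {
        for i := 0; i < perHost; i++ {
            wg.Add(1)
            go func(url string) {
                defer wg.Done()
                resp, err := client.Get(url)
                if err != nil {
                    return
                }
                // Drain and close the body so the connection is returned to the idle pool.
                io.Copy(io.Discard, resp.Body)
                resp.Body.Close()
            }(u)
        }
    }
    wg.Wait()
    // netstat -p tcp should now show up to perHost ESTABLISHED connections per
    // server; they stay open only until the remote server's keep-alive timeout.
}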

Related

java.net.SocketException: Connection reset on reaching 3000 users in JMeter

All the required changes have been made in the respective files, for example:
stalecheck=true,
keep-alive is checked in HTTP Request Defaults,
retrycount=1,
hc.parameters file changes,
socket timeout is 240000.
Still we see "java.net.SocketException: Connection reset" in the response data; however, I see the valid requests being passed to the server.
The issue didn't appear until we reached 3000 users; everything worked smoothly up to that point.
Connection reset can have many causes; possible reasons are:
One of the server components is not able to handle the load, so it closes connections on its side.
On the JMeter side, check that you are running in non-GUI mode and that neither the JMeter JVM nor the injector machine is overloaded, which could explain this. See:
https://jmeter.apache.org/usermanual/get-started.html#non_gui

Problem with gRPC setup. Getting an intermittent RPC unavailable error

I have a gRPC server and client that work as expected most of the time, but I do get a "transport is closing" error occasionally:
rpc error: code = Unavailable desc = transport is closing
I'm wondering if it's a problem with my setup. The client is pretty basic
connection, err := grpc.Dial(address, grpc.WithInsecure(), grpc.WithBlock())
pb.NewAppClient(connection)
defer connection.Close()
and calls are made with a timeout like
ctx, cancel := context.WithTimeout(ctx, 300*time.Millisecond)
defer cancel()
client.MyGRPCMethod(ctx, params)
One other thing I'm doing is checking the connection to see if it's either open, idle or connecting, and reusing the connection if so. Otherwise, redialing.
No special configuration is happening on the server:
grpc.NewServer()
Are there any common mistakes setting up a grpc client/server that I might be making?
After much searching, I have finally come to an acceptable and logical solution to this problem.
The root-cause is this: The underlying TCP connection is closed abruptly, but neither the gRPC Client nor Server are 'notified' of this event.
The challenge is at multiple levels:
Kernel's management of TCP sockets
Any intermediary load-balancers/reverse-proxies (by Cloud Providers or otherwise) and how they manage TCP sockets
Your application layer itself and its networking requirements: whether it can reuse the same connection for future requests or not
My solution turned out to be fairly simple:
// Requires the "google.golang.org/grpc/keepalive" and "time" packages.
server := grpc.NewServer(
    grpc.KeepaliveParams(keepalive.ServerParameters{
        MaxConnectionIdle: 5 * time.Minute, // <--- This fixes it!
    }),
)
This ensures that the gRPC server closes the underlying TCP socket gracefully itself before any abrupt kills from the kernel or intermediary servers (AWS and Google Cloud Load Balancers both have larger timeouts than 5 minutes).
An added bonus: in any places where you're using multiple connections, leaks introduced by clients that forget to Close the connection will not affect your server either.
My $0.02: don't blindly trust any organisation's (even Google's) ability to design and maintain APIs. This is a classic case of defaults gone wrong.
One other thing I'm doing is checking the connection to see if it's either open, idle or connecting, and reusing the connection if so. Otherwise, redialing.
grpc will manage your connections for you, reconnecting when needed, so you should never need to monitor it after creating it unless you have very specific needs.
"transport is closing" has many different reasons for happening; please see the relevant question in our FAQ and let us know if you still have questions: https://github.com/grpc/grpc-go#the-rpc-failed-with-error-code--unavailable-desc--transport-is-closing
I had about the same issue earlier this year. After about 15 minutes, the servers closed the connection.
My solution, which is working, was to call grpc.Dial once in my main function and then create the pb.NewAppClient(connection) on each request. Since the connection was already established, latency wasn't an issue. After the request was done I closed the client.
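A minimal sketch of that pattern, reusing the identifiers from the question (the import path example.com/myapp/pb, the server address, and the MyRequest/MyResponse message types are hypothetical):

package main

import (
    "context"
    "log"
    "time"

    "google.golang.org/grpc"

    pb "example.com/myapp/pb" // hypothetical path to the generated code
)

var conn *grpc.ClientConn // dialed once, shared by all requests

func main() {
    var err error
    conn, err = grpc.Dial("server.example.com:443", grpc.WithInsecure(), grpc.WithBlock())
    if err != nil {
        log.Fatalf("dial: %v", err)
    }
    defer conn.Close()

    // ... handle incoming work, calling doRequest for each item ...
}

func doRequest(params *pb.MyRequest) (*pb.MyResponse, error) {
    // Stubs are cheap to create; the expensive part is the dialed connection.
    client := pb.NewAppClient(conn)
    ctx, cancel := context.WithTimeout(context.Background(), 300*time.Millisecond)
    defer cancel()
    return client.MyGRPCMethod(ctx, params)
}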

FTP data connections reuse

I am working on an FTP client for kicks and I am trying to understand the workflow of data connections.
As I understand, the initial (command) connection is permanent until you quit. However, I am unsure of the data connection - is it re-initiated per-command? So you call PORT ... or PASV, get a second connection, do a LIST, get the results, connection closes, start over?
Also, do you need to call PASV (or PORT ...) again after each data connection closes? It seems that when I try to test some things out using a passive connection, I cannot reconnect to the same port after the first command has returned its results and the data connection has closed. I can keep calling PASV -> Data Connect -> Run Command -> Get Results -> Data Connection closed -> PASV, but it seems like that's not how it's meant to run?
Also, if someone has good material on FTP that is more terse than the RFC, I would really appreciate it.
You have to open a new connection every time. Closing the connection is the only way you (or the server) can tell that the transfer has completed (at least in the common "stream mode").
You cannot even reuse the local/remote port number combination: when a TCP connection is closed, it enters the TIME_WAIT state and that port number combination cannot be used for some time. So for two immediately consecutive transfers you have to use a different port number combination anyway.
Refer to RFC 959, section 3.3. Data management:
Reuse of the Data Connection: When using the stream mode of data
transfer the end of the file must be indicated by closing the
connection. This causes a problem if multiple files are to be
transfered in the session, due to need for TCP to hold the
connection record for a time out period to guarantee the reliable
communication. Thus the connection can not be reopened at once.
There are two solutions to this problem. The first is to
negotiate a non-default port. The second is to use another
transfer mode.
A comment on transfer modes. The stream transfer mode is
inherently unreliable, since one can not determine if the
connection closed prematurely or not. The other transfer modes
(Block, Compressed) do not close the connection to indicate the
end of file. They have enough FTP encoding that the data
connection can be parsed to determine the end of the file.
Thus using these modes one can leave the data connection open
for multiple file transfers.
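For illustration, here is a minimal Go sketch of the per-transfer PASV flow described above; the host ftp.example.com and the anonymous credentials are assumptions, and multi-line replies and error handling are heavily simplified:

package main

import (
    "bufio"
    "fmt"
    "io"
    "net"
    "os"
    "regexp"
    "strconv"
)

// readReply reads a single-line control reply (multi-line replies are ignored for brevity).
func readReply(r *bufio.Reader) string {
    line, err := r.ReadString('\n')
    if err != nil {
        panic(err)
    }
    return line
}

func sendCmd(w io.Writer, r *bufio.Reader, cmd string) string {
    fmt.Fprintf(w, "%s\r\n", cmd)
    return readReply(r)
}

// pasvAddr extracts host:port from "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)".
var pasvRe = regexp.MustCompile(`\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)`)

func pasvAddr(reply string) string {
    m := pasvRe.FindStringSubmatch(reply)
    p1, _ := strconv.Atoi(m[5])
    p2, _ := strconv.Atoi(m[6])
    return fmt.Sprintf("%s.%s.%s.%s:%d", m[1], m[2], m[3], m[4], p1*256+p2)
}

// list performs one transfer: a fresh PASV, a fresh data connection, LIST, and
// the server closing the data connection marks the end of the transfer.
func list(ctrl net.Conn, r *bufio.Reader) {
    addr := pasvAddr(sendCmd(ctrl, r, "PASV")) // 227
    data, err := net.Dial("tcp", addr)
    if err != nil {
        panic(err)
    }
    sendCmd(ctrl, r, "LIST") // 150: results are sent on the data connection
    io.Copy(os.Stdout, data) // read until the server closes the data connection
    data.Close()
    readReply(r) // 226: transfer complete
}

func main() {
    ctrl, err := net.Dial("tcp", "ftp.example.com:21")
    if err != nil {
        panic(err)
    }
    defer ctrl.Close()
    r := bufio.NewReader(ctrl)
    readReply(r)                       // 220 greeting
    sendCmd(ctrl, r, "USER anonymous") // 331
    sendCmd(ctrl, r, "PASS guest")     // 230
    list(ctrl, r) // first transfer: its own PASV and data connection
    list(ctrl, r) // second transfer: PASV has to be issued again
    sendCmd(ctrl, r, "QUIT")
}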
See also:
Why does FTP passive mode require a port range as opposed to only one port?
How many data channel ports do I need for an FTPS server running behind NAT?

TCP and Apache KeepAliveTimeout

A few weeks ago I wrote a small program which created a socket to an Apache web server and made a request.
Back then I did not know that this web server had a KeepAliveTimeout of 5 seconds.
After my first request I waited one minute. After this I wanted to reuse my first socket for another web server request, but got an error.
From Beej's Guide to Network Programming I learned that if recv returns 0, then the other side has closed its connection:
Wait! recv() can return 0. This can mean only one thing: the remote side has closed
the connection on you! A return value of 0 is recv()'s way of letting you know this
has occurred.
My questions are now:
What does Apache send when the KeepAliveTimeout is over - a FIN or a RST packet?
I know that using a TCP connection for two unrelated HTTP requests, like in this scenario, might not be the best thing. But in order to understand TCP better, the next question is:
After my first successful HTTP request, and before sending the next HTTP request over the same socket, is there any way to get informed about this KeepAliveTimeout TCP socket termination by the server other than receiving 0 from the next recv() call?
It will send a FIN. If you write a request to the server after that, send() will return -1 with errno/WSAGetLastError() = ECONNRESET.
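In Go terms, that FIN shows up as io.EOF on the next read, which is the analogue of recv() returning 0. A minimal sketch, assuming a hypothetical Apache host www.example.com with a short KeepAliveTimeout:

package main

import (
    "bufio"
    "fmt"
    "io"
    "net"
    "time"
)

func main() {
    conn, err := net.Dial("tcp", "www.example.com:80") // hypothetical Apache host
    if err != nil {
        panic(err)
    }
    defer conn.Close()

    // HEAD avoids a response body, so nothing is left unread on the connection.
    fmt.Fprintf(conn, "HEAD / HTTP/1.1\r\nHost: www.example.com\r\nConnection: keep-alive\r\n\r\n")
    r := bufio.NewReader(conn)
    for {
        line, err := r.ReadString('\n')
        if err != nil || line == "\r\n" { // blank line ends the response headers
            break
        }
    }

    time.Sleep(10 * time.Second) // wait past the server's KeepAliveTimeout

    // The FIN the server sent when its timeout expired is reported as io.EOF here.
    if _, err := r.ReadByte(); err == io.EOF {
        fmt.Println("server closed the idle connection")
    }
}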
would there be somehow a possibility to get informed about this keepalivetimeout tcp socket termination of the server
Yes, by reading the proper response header parameter, namely Keep-Alive: timeout=delta-seconds:
'timeout' Parameter
A host sets the value of the timeout parameter to the time that the host will allow an idle connection to remain open before it is closed. A connection is idle if no data is sent or received by a host.
The value of the timeout parameter is a single integer in seconds.
A host MAY keep an idle connection open for longer than the time that it indicates, but it SHOULD attempt to retain a connection for at least as long as indicated.
As you can see, it's up to the host to decide. It only SHOULD try to keep the connection open as long as promised, and it isn't required to do so in order to conform to the spec, so the server might decide to close and reuse the connection to serve another pending client.
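For example, with Go's net/http the header (which Apache typically sends as Keep-Alive: timeout=5, max=100 when keep-alive is enabled, and assuming no intermediary strips this hop-by-hop header) can be read from the response; the URL is an assumption:

package main

import (
    "fmt"
    "io"
    "net/http"
)

func main() {
    resp, err := http.Get("http://www.example.com/") // hypothetical Apache host
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    io.Copy(io.Discard, resp.Body) // drain the body so the connection can be reused

    // e.g. "timeout=5, max=100": the server promises to keep the idle
    // connection open for roughly 5 more seconds.
    fmt.Println("Keep-Alive:", resp.Header.Get("Keep-Alive"))
}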

Frozen connection using ssh over Amazon EC2 using ubuntu

When I am connected to Amazon EC2 using the secure shell and don't type anything for a few minutes, everything freezes. I can't type anything or exit. After a few minutes I get a message from the server...
Last login: Fri Dec 6 23:21:28 2013 from pool-173-52-249-158.nycmny.east.verizon.net
ubuntu@ip-172-31-31-33:~$ Write failed: Broken pipe
Some of you must have had this problem before. It would be great if you could shed some light on the situation for a newbie using the cloud.
Try the options below:
Explore ServerAliveCountMax and ServerAliveInterval. These settings are set in /etc/ssh/ssh_config on the SSH client side.
from man ssh_config:
ServerAliveCountMax
Sets the number of server alive messages (see below) which may be sent without ssh(1) receiving any messages back from the server. If this threshold is reached while server alive messages are being sent, ssh will disconnect from the server, terminating the session. It is important to note that the use of server alive messages is very different from TCPKeepAlive (below). The server alive messages are sent through the encrypted channel and therefore will not be spoofable. The TCP keepalive option enabled by TCPKeepAlive is spoofable. The server alive mechanism is valuable when the client or server depend on knowing when a connection has become inactive.
The default value is 3. If, for example, ServerAliveInterval (see below) is set to 15 and ServerAliveCountMax is left at the default, if the server becomes unresponsive, ssh will disconnect after approximately 45 seconds. This option applies to protocol version 2 only; in protocol version 1 there is no mechanism to request a response from the server to the server alive messages, so disconnection is the responsibility of the TCP stack.
And
ServerAliveInterval
Sets a timeout interval in seconds after which if no data has been received from the server, ssh(1) will send a message through the encrypted channel to request a response from the server. The default is 0, indicating that these messages will not be sent to the server, or 300 if the BatchMode option is set. This option applies to protocol version 2 only. ProtocolKeepAlives and SetupTimeOut are Debian-specific compatibility aliases for this option.
Similar settings are also available on the server side: ClientAliveInterval and ClientAliveCountMax. These settings are placed in /etc/ssh/sshd_config on the server side.
from man sshd_config:
ClientAliveCountMax
Sets the number of client alive messages (see below) which may be sent without sshd(8) receiving any messages back from the client. If this threshold is reached while client alive messages are being sent, sshd will disconnect the client, terminating the session. It is important to note that the use of client alive messages is very different from TCPKeepAlive (below). The client alive messages are sent through the encrypted channel and therefore will not be spoofable. The TCP keepalive option enabled by TCPKeepAlive is spoofable. The client alive mechanism is valuable when the client or server depend on knowing when a connection has become inactive.
The default value is 3. If ClientAliveInterval (see below) is set to 15, and ClientAliveCountMax is left at the default, unresponsive SSH clients will be disconnected after approximately 45 seconds. This option applies to protocol version 2 only.
And
ClientAliveInterval
Sets a timeout interval in seconds after which if no data has been received from the client, sshd(8) will send a message through the encrypted channel to request a response from the client. The default is 0, indicating that these messages will not be sent to the client. This option applies to protocol version 2 only.
It looks like the firewalls (at your different locations) are dropping the sessions due to inactivity.
I would try, as @slayedbylucifer stated, something like this in your ~/.ssh/config:
Host *
    ServerAliveInterval 60
