Go HTTP idle connection pool and http trace - go

I have a long running GO program(version 1.18) which sends hundreds of GET requests simultaneously per second using RESTY to the remote https://api.abcd.com. Each GET request is a separate go-routine which uses the same RESTY client.
remote server https://api.abcd.com is nginx/1.19.2(HTTP/2), IP address is 11.11.11.11 and 22.22.22.22. Yes, this remoter server has multiple IP addresses.
I use hostname when setting RESTY client
SetBaseURL("https://api.abcd.com")
Transport configuration are default one in RESTY.
TraceInfo() is enabled on RESTY client side. There is a "IsConnReused" field in the trace info. This IsConnReused actually comes from struct GotConnInfo in GO httptrace package:
type GotConnInfo struct {
Conn net.Conn
// Reused is whether this connection has been previously used for another HTTP request.
Reused bool
// WasIdle is whether this connection was obtained from anidle pool.
WasIdle bool
// IdleTime reports how long the connection was previously idle, if WasIdle is true.
IdleTime time.Duration
}
question 1: GO httptrace determine "Connection reused" based on hostname(api.abcd.com) or IP address?
question 2: GO http package idle connection pool is actually a map, key is a struct type connectMethodKey. The addr field in this struct is hostname or IP address?
type connectMethodKey struct {
proxy, scheme, addr string
onlyH1 bool
}
This is what I found in TraceInfo(). When the program runs at the beginning, all requests are sent to 11.11.11.11:443. Few minutes later, all requests are sent to 22.22.22.22, no 11.11.11.11 anymore. Then, few minutes later, all requests start to sent to 11.11.11.11 again, no 22.22.22.22 this time.
question 3: when requests start to sent to 22.22.22.22, the socket connections to 11.11.11.11 are idle, why GO http does not use idle connections anymore? I don't think those idle connection has already timeout.

question 1: GO httptrace determine "Connection reused" based on
hostname(api.abcd.com) or IP address?
httptrace.GotConnInfo.Reused tracks if TCP connection was used for another HTTP request. It is per IP address.
question 2: GO http package idle connection pool is actually a map,
key is a struct type connectMethodKey. The addr field in this struct
is hostname or IP address?
addr is hostname
Could be an IP though if you send request to something like http://127.0.0.1/.
question 3: when requests start to sent to 22.22.22.22, the socket
connections to 11.11.11.11 are idle, why GO http does not use idle
connections anymore? I don't think those idle connection has already
timeout.
It could work differently if you use HTTP 1.
With it, every request requires the own TCP connection. Subsequent requests may reuse TCP connection, but if you want to run requests in parallel, you need to establish multiple TCP connections. Every connection would use a different IP address, and you would see traffic evenly distributed.
With HTTP/2 a single TCP connection can be used for multiple parallel requests. That connection uses a single IP address.
This is how GO calculates if a new request can use open connection:
https://cs.opensource.google/go/x/net/+/69896b71:http2/transport.go;l=881;drc=69896b714898bee1e3403560cd2e1870bcc8bd35;bpv=1;bpt=1
Play with these prams to distribute the traffic across multiple TCP connections.

Related

OKHttp3: how to retry another IP address if one is unreachable

does OkHttp3 support the following case:
x.x.x.x myapp.com
y.y.y.y myapp.com
we have two IPs for one hostname, looks like OkHttpClient always retry the first IP address instead of trying another available IP address.
does retryOnConnectionFailure(true) support this? from the doc, by default it should support this?
Configure this client to retry or not when a connectivity problem is encountered. By default, this client silently recovers from the following problems:
Unreachable IP addresses. If the URL’s host has multiple IP
addresses, failure to reach any individual IP address doesn’t fail
the overall request. This can increase availability of multi-homed
services.
Stale pooled connections. The ConnectionPool reuses sockets to
decrease request latency, but these connections will occasionally
time out.
Unreachable proxy servers. A ProxySelector can be used to attempt
multiple proxy servers in sequence, eventually falling back to a
direct connection.
Set this to false to avoid retrying requests when doing so is
destructive. In this case the calling application should do its own
recovery of connectivity failures.
OkHttp will try both in sequence.

Multiple clients - one server - one port? [duplicate]

This question already has answers here:
Does the port change when a server accepts a TCP connection?
(3 answers)
Closed 4 years ago.
I understand the basics of how ports work. However, what I don't get is how multiple clients can simultaneously connect to say port 80. I know each client has a unique (for their machine) port. Does the server reply back from an available port to the client, and simply state the reply came from 80? How does this work?
First off, a "port" is just a number. All a "connection to a port" really represents is a packet which has that number specified in its "destination port" header field.
Now, there are two answers to your question, one for stateful protocols and one for stateless protocols.
For a stateless protocol (ie UDP), there is no problem because "connections" don't exist - multiple people can send packets to the same port, and their packets will arrive in whatever sequence. Nobody is ever in the "connected" state.
For a stateful protocol (like TCP), a connection is identified by a 4-tuple consisting of source and destination ports and source and destination IP addresses. So, if two different machines connect to the same port on a third machine, there are two distinct connections because the source IPs differ. If the same machine (or two behind NAT or otherwise sharing the same IP address) connects twice to a single remote end, the connections are differentiated by source port (which is generally a random high-numbered port).
Simply, if I connect to the same web server twice from my client, the two connections will have different source ports from my perspective and destination ports from the web server's. So there is no ambiguity, even though both connections have the same source and destination IP addresses.
Ports are a way to multiplex IP addresses so that different applications can listen on the same IP address/protocol pair. Unless an application defines its own higher-level protocol, there is no way to multiplex a port. If two connections using the same protocol simultaneously have identical source and destination IPs and identical source and destination ports, they must be the same connection.
Important:
I'm sorry to say that the response from "Borealid" is imprecise and somewhat incorrect - firstly there is no relation to statefulness or statelessness to answer this question, and most importantly the definition of the tuple for a socket is incorrect.
First remember below two rules:
Primary key of a socket: A socket is identified by {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT, PROTOCOL} not by {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT} - Protocol is an important part of a socket's definition.
OS Process & Socket mapping: A process can be associated with (can open/can listen to) multiple sockets which might be obvious to many readers.
Example 1: Two clients connecting to same server port means: socket1 {SRC-A, 100, DEST-X,80, TCP} and socket2{SRC-B, 100, DEST-X,80, TCP}. This means host A connects to server X's port 80 and another host B also connects to the same server X to the same port 80. Now, how the server handles these two sockets depends on if the server is single-threaded or multiple-threaded (I'll explain this later). What is important is that one server can listen to multiple sockets simultaneously.
To answer the original question of the post:
Irrespective of stateful or stateless protocols, two clients can connect to the same server port because for each client we can assign a different socket (as the client IP will definitely differ). The same client can also have two sockets connecting to the same server port - since such sockets differ by SRC-PORT. With all fairness, "Borealid" essentially mentioned the same correct answer but the reference to state-less/full was kind of unnecessary/confusing.
To answer the second part of the question on how a server knows which socket to answer. First understand that for a single server process that is listening to the same port, there could be more than one socket (maybe from the same client or from different clients). Now as long as a server knows which request is associated with which socket, it can always respond to the appropriate client using the same socket. Thus a server never needs to open another port in its own node than the original one on which the client initially tried to connect. If any server allocates different server ports after a socket is bound, then in my opinion the server is wasting its resource and it must be needing the client to connect again to the new port assigned.
A bit more for completeness:
Example 2: It's a very interesting question: "can two different processes on a server listen to the same port". If you do not consider protocol as one of the parameters defining sockets then the answer is no. This is so because we can say that in such a case, a single client trying to connect to a server port will not have any mechanism to mention which of the two listening processes the client intends to connect to. This is the same theme asserted by rule (2). However, this is the WRONG answer because 'protocol' is also a part of the socket definition. Thus two processes in the same node can listen to the same port only if they are using different protocols. For example, two unrelated clients (say one is using TCP and another is using UDP) can connect and communicate to the same server node and to the same port but they must be served by two different server processes.
Server Types - single & multiple:
When a server processes listening to a port that means multiple sockets can simultaneously connect and communicate with the same server process. If a server uses only a single child process to serve all the sockets then the server is called single-process/threaded and if the server uses many sub-processes to serve each socket by one sub-process then the server is called a multi-process/threaded server. Note that irrespective of the server's type a server can/should always use the same initial socket to respond back (no need to allocate another server port).
Suggested Books and the rest of the two volumes if you can.
A Note on Parent/Child Process (in response to query/comment of 'Ioan Alexandru Cucu')
Wherever I mentioned any concept in relation to two processes say A and B, consider that they are not related by the parent-child relationship. OS's (especially UNIX) by design allows a child process to inherit all File-descriptors (FD) from parents. Thus all the sockets (in UNIX like OS are also part of FD) that process A listening to can be listened to by many more processes A1, A2, .. as long as they are related by parent-child relation to A. But an independent process B (i.e. having no parent-child relation to A) cannot listen to the same socket. In addition, also note that this rule of disallowing two independent processes to listen to the same socket lies on an OS (or its network libraries), and by far it's obeyed by most OS's. However, one can create own OS which can very well violate this restriction.
TCP / HTTP Listening On Ports: How Can Many Users Share the Same Port
So, what happens when a server listen for incoming connections on a TCP port? For example, let's say you have a web-server on port 80. Let's assume that your computer has the public IP address of 24.14.181.229 and the person that tries to connect to you has IP address 10.1.2.3. This person can connect to you by opening a TCP socket to 24.14.181.229:80. Simple enough.
Intuitively (and wrongly), most people assume that it looks something like this:
Local Computer | Remote Computer
--------------------------------
<local_ip>:80 | <foreign_ip>:80
^^ not actually what happens, but this is the conceptual model a lot of people have in mind.
This is intuitive, because from the standpoint of the client, he has an IP address, and connects to a server at IP:PORT. Since the client connects to port 80, then his port must be 80 too? This is a sensible thing to think, but actually not what happens. If that were to be correct, we could only serve one user per foreign IP address. Once a remote computer connects, then he would hog the port 80 to port 80 connection, and no one else could connect.
Three things must be understood:
1.) On a server, a process is listening on a port. Once it gets a connection, it hands it off to another thread. The communication never hogs the listening port.
2.) Connections are uniquely identified by the OS by the following 5-tuple: (local-IP, local-port, remote-IP, remote-port, protocol). If any element in the tuple is different, then this is a completely independent connection.
3.) When a client connects to a server, it picks a random, unused high-order source port. This way, a single client can have up to ~64k connections to the server for the same destination port.
So, this is really what gets created when a client connects to a server:
Local Computer | Remote Computer | Role
-----------------------------------------------------------
0.0.0.0:80 | <none> | LISTENING
127.0.0.1:80 | 10.1.2.3:<random_port> | ESTABLISHED
Looking at What Actually Happens
First, let's use netstat to see what is happening on this computer. We will use port 500 instead of 80 (because a whole bunch of stuff is happening on port 80 as it is a common port, but functionally it does not make a difference).
netstat -atnp | grep -i ":500 "
As expected, the output is blank. Now let's start a web server:
sudo python3 -m http.server 500
Now, here is the output of running netstat again:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:500 0.0.0.0:* LISTEN -
So now there is one process that is actively listening (State: LISTEN) on port 500. The local address is 0.0.0.0, which is code for "listening for all". An easy mistake to make is to listen on address 127.0.0.1, which will only accept connections from the current computer. So this is not a connection, this just means that a process requested to bind() to port IP, and that process is responsible for handling all connections to that port. This hints to the limitation that there can only be one process per computer listening on a port (there are ways to get around that using multiplexing, but this is a much more complicated topic). If a web-server is listening on port 80, it cannot share that port with other web-servers.
So now, let's connect a user to our machine:
quicknet -m tcp -t localhost:500 -p Test payload.
This is a simple script (https://github.com/grokit/dcore/tree/master/apps/quicknet) that opens a TCP socket, sends the payload ("Test payload." in this case), waits a few seconds and disconnects. Doing netstat again while this is happening displays the following:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:500 0.0.0.0:* LISTEN -
tcp 0 0 192.168.1.10:500 192.168.1.13:54240 ESTABLISHED -
If you connect with another client and do netstat again, you will see the following:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:500 0.0.0.0:* LISTEN -
tcp 0 0 192.168.1.10:500 192.168.1.13:26813 ESTABLISHED -
... that is, the client used another random port for the connection. So there is never confusion between the IP addresses.
Normally, for every connecting client the server forks a child process that communicates with the client (TCP). The parent server hands off to the child process an established socket that communicates back to the client.
When you send the data to a socket from your child server, the TCP stack in the OS creates a packet going back to the client and sets the "from port" to 80.
Multiple clients can connect to the same port (say 80) on the server because on the server side, after creating a socket and binding (setting local IP and port) listen is called on the socket which tells the OS to accept incoming connections.
When a client tries to connect to server on port 80, the accept call is invoked on the server socket. This creates a new socket for the client trying to connect and similarly new sockets will be created for subsequent clients using same port 80.
Words in italics are system calls.
Ref
http://www.scs.stanford.edu/07wi-cs244b/refs/net2.pdf

How to get the IP address of a connected WebSocket-client?

I'm currently working on a ABAP Push Channel server to WebSocket client connection and I need the IP-address of the client in order to identify whether this client is the one I want to send the message to. In my scenario there could be multiple WebSocket connections.
Now there is the ssi_websocket_table table and the ssi_websocket_table_row row with the the field caller_ip, however this gives me the IP address of the DNS-Server of the network I'm connected to, and I expected the IP address of my local PC since the WebSocket-client is running on this machine.
Is there any other way to get the clients IP address from an active WebSocket connection in ABAP?
P.S. Looking at all the table entries, it shows the correct IP when using a different server configuration, as soon as I know why that's the case I will report back.
As pointed out by vwegert it makes no sense to use the IP to tell the WebSockets apart, I think it would probably be better to use an ID for each WebSocket connection instead.
You could get the IP from the WebSocket server context which gets the IP header apparently from the opening HTTP handshake for the connection:
DATA(lo_context) = i_context. " IF_APC_WSP_SERVER_CONTEXT type
DATA(lo_request) = lo_context->get_initial_request( ).
" initialize G_CONTEXT_ID_FIELD for PCP_SET_CONTEXT_FIELDS
DATA(lv_id) = lo_request->get_header_field( if_http_header_fields_sap=>remote_addr ).
the sample is taken from the SAP standard class CL_APC_WS_EXT_ABAP_ONLINE_COMM, ON_MESSAGE method.

tcp and apache keepalivetimouts

A few weeks ago I wrote a small program which created a socket to an apache webserver and made a request.
Back then I did not know that this web server had a KeepAliveTimeout of 5 seconds.
After my first request I waited 1 minute. After this I wanted to reuse my first socket for another webserver request, but got an error.
From Beej's Guide to Network Programming I learned that if recv returns 0, then the other side has closed its connection:
Wait! recv() can return 0. This can mean only one thing: the remote side has closed
the connection on you! A return value of 0 is recv()'s way of letting you know this
has occurred.
My questions are now:
What does Apache send when the KeepAliveTimeout is over - a FIN or a RST packet?
I know that using a TCP connection for 2 unrelated HTTP requests like in this scenario might
not be the best thing. But in order to understand TCP more the next question is:
After my first successful http request, and before sending the next HTTP request over the same socket, would there be somehow a possibility to get informed about this keepalivetimeout TCPsocket termination of the server other than receiving 0 from the next recv() call?
It will send a FIN. If you write a request to the server after that, send() will return -1 with errno/WSAGetLastError() = ECONNRESET.
would there be somehow a possibility to get informed about this keepalivetimeout tcp socket termination of the server
Yes, by reading the proper response header parameter, namely Keep-Alive: timeout=delta-seconds:
'timeout' Parameter
A host sets the value of the timeout parameter to the time that the host will allows an idle connection to remain open before it is closed. A connection is idle if no data is sent or received by a host.
The value of the timeout parameter is a single integer in seconds.
A host MAY keep an idle connection open for longer than the time that it indicates, but it SHOULD attempt to retain a connection for at least as long as indicated.
As you can see, it's up to the host to decide. Given it only SHOULD try to keep the connection open as long as promised, but it isn't required that it does in order to conform to the spec, so the server might decide to close and reuse the connection to serve another pending client.

TCP Hole Punching

I'm trying to implement TCP hole punching with windows socket using mingw toolchain. I think the process is right but the hole doesn't seems to take. I used this as reference.
A and B connect to the server S
S sends to A, B's router IP + the port it used to connect to S
S does the same for B
A start 2 threads:
One thread tries connecting to B's router with the info sent by S
The other thread is waiting for an incoming connection on the same port used to connect to its router when it connected to S
B does the same
I have no issue in the code I think since:
A and B does get each other ip and port to use
They are both listening on the port they used to connect to their router when they contacted the server
They are both connecting to the right ip and port but get timed out (code error 10060)
I am missing something ?
EDIT: With the help of process explorer, I see that one of the client managed to establish a connection to the peer. But the peer doesn't seems to consider the connection to be made.
Here is what I captured with Wireshark. For the sake of the example, the server S and the client A are on the same PC. The server S listens on a specific port (8060) redirected to that PC. B still tries to connect on the right IP because it sees that the public address of A sent by S is localhost and therefore uses the public IP of S instead. (I have replaced the public IPs by placeholders)
EDIT 2: I think the confusion is due to the fact that both incoming and outcoming connection request data are transfered on the same port. Which seems to mess up the connection state because we don't know which socket will get the data from the port. If I quote msdn:
The SO_REUSEADDR socket option allows a socket to forcibly bind to a
port in use by another socket. The second socket calls setsockopt with
the optname parameter set to SO_REUSEADDR and the optval parameter set
to a boolean value of TRUE before calling bind on the same port as the
original socket. Once the second socket has successfully bound, the
behavior for all sockets bound to that port is indeterminate.
But talking on the same port is required by the TCP Hole Punching technique to open up the holes !
A start 2 threads:
One thread tries connecting to B's router with the info sent by S
The other thread is waiting for an incoming connection on the same port used to connect to its router when it connected to S
You can't do this with two threads, since it's just one operation. Every TCP connection that is making an outbound connection is also waiting for an incoming connection. You simply call 'connect', and you are both sending outbound SYNs to make a connection and waiting for inbound SYNs to make a connection.
You may, however, need to close your connection to the server. Your platform likely doesn't permit you to make a TCP connection from a port when you already have an established connection from that same port. So just as you start TCP hole punching, close the connection to the server. Bind a new TCP socket to that same port, and call connect.
A simple solution to traverse into NAT routers is to make your traffic follow a protocol that your NAT already has an algorithm for forwarding, such as FTP.
Use Wireshark to check tcp connection request(3-way Handhsake process) is going properly.
Ensure your Listener thread is having select() to de-multiplex the descriptor.
sockPeerConect(socket used to connect Other peer) is FD_SET() in Listener Thread.
Ensure your are checking
int Listener Thread()
{
while(true)
{
FD_SET(sockPeerConn);
FD_SET(sockServerConn);
FD_SET(nConnectedSock );
if (FD_ISSET(sockPeerConect)
{
/// and calling accept() in side the
nConnectedSock = accept( ....);
}
if (FD_ISSET(sockServerConn)
{
/// receive data from Server
recv(sockServerConn );
}
if (FD_ISSET(nConnectedSock )
{
/// Receive data from Other Peer
recv(nConnectedSock );
}
}
}
5.Ensure you are simultaneously starting peer connection A to B and B to A.
6.Start your Listener Thread Prior to Connection to server and Peer and have Single Listener Thread for receiving Server and Client.
not every router supports tcp hole punching, please check out the following paper which explains in detail:
Peer-to-Peer Communication Across Network Address Translators

Resources