Windows TCP socket recv delay

An external controller sends a 120-byte message over a TCP/IP socket every 30 ms.
The application receives these messages through the standard recv() function.
This works perfectly under Linux and OS X (recv returns a 120-byte message every 30 ms).
Under Windows, recv returns a buffer of ~3500 bytes about once per second; the rest of the time it returns 0.
Wireshark under Windows shows the messages are indeed arriving every 30 ms.
How can I make the Windows TCP socket work properly (without this delay)?
PS: I've already played with TCP_NODELAY and TcpAckFrequency. Wireshark shows everything is OK, so I think it's some Windows optimization that should be turned off.
Reading:
int WMaster::DataRead(void)
{
    if (!open_ok) return 0;
    if (!CheckSocket())
    {
        PrintErrNo();
        return 0;
    }
    iResult = recv(ConnectSocket, (char *)input_buff, sizeof(input_buff), 0);
    nError = WSAGetLastError();
    if (nError == 0) return iResult;
    if (nError == WSAEWOULDBLOCK) return iResult;
    PrintErrNo();
    return 0;
}
Initialization:
ConnectSocket = INVALID_SOCKET;
iResult = WSAStartup(MAKEWORD(2,2), &wsaData);
ConnectSocket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
ZeroMemory(&clientService, sizeof(clientService));
clientService.sin_family = AF_INET;
clientService.sin_addr.s_addr = inet_addr( deviceName.toLatin1().constData() );
clientService.sin_port = htons( port );
iResult = setsockopt(ConnectSocket, IPPROTO_TCP, TCP_NODELAY, (char *)&flag, sizeof(int));
u_long iMode = 1;
iResult = ioctlsocket(ConnectSocket, FIONBIO, &iMode);
iResult = ::connect(ConnectSocket, (SOCKADDR*)&clientService, sizeof(clientService));
CheckSocket:
bool WMaster::CheckSocket(void)
{
    socklen_t len = sizeof(int);
    int retval = getsockopt(ConnectSocket, SOL_SOCKET, SO_ERROR, (char*)(&valopt), &len);
    if (retval != 0)
    {
        open_ok = false;
        return false;
    }
    return true;
}

Consider disabling the Nagle algorithm. 120 bytes is quite small, so it's possible the data is being buffered before being sent. Another reason I suspect the Nagle algorithm: about 33 sends should happen per second, which corresponds to 33*120 = 3960 bytes/sec, very similar to the ~3500 you are seeing.
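If you also control the controller's side, a minimal sketch of disabling Nagle on the sending socket (assuming a Winsock/BSD-style socket named sendSocket, which is not in the original code) might look like this:

// Hedged sketch: disable Nagle on the *sending* socket so small writes
// are pushed out immediately instead of being coalesced into larger segments.
int flag = 1;
if (setsockopt(sendSocket, IPPROTO_TCP, TCP_NODELAY,
               (const char*)&flag, sizeof(flag)) != 0)
{
    // setsockopt failed; inspect WSAGetLastError() / errno
}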

Change your DataRead function as follows, so that WSAGetLastError() is only called when there actually is an error:
int WMaster::DataRead(void)
{
    if (!open_ok) return 0;
    if (!CheckSocket())
    {
        PrintErrNo();
        return 0;
    }
    iResult = recv(ConnectSocket, (char *)input_buff, sizeof(input_buff), 0);
    if (iResult >= 0)
    {
        return iResult;
    }
    nError = WSAGetLastError();
    if (nError == WSAEWOULDBLOCK) return iResult;
    PrintErrNo();
    return 0;
}
The fact that you are polling the socket every millisecond may have something to do with your performance problem, but I'd like to see the source of CheckSocket before concluding that is the problem.
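As an aside, one way to avoid polling every millisecond is to block in select() with a timeout and only call recv() when the socket is reported readable. A minimal sketch, reusing the question's ConnectSocket and input_buff and an illustrative 100 ms timeout:

// Hedged sketch: wait up to 100 ms for data instead of polling every 1 ms.
fd_set readfds;
FD_ZERO(&readfds);
FD_SET(ConnectSocket, &readfds);

timeval tv;
tv.tv_sec = 0;
tv.tv_usec = 100 * 1000;   // 100 ms

int ready = select(0, &readfds, NULL, NULL, &tv);  // first argument is ignored by Winsock
if (ready > 0 && FD_ISSET(ConnectSocket, &readfds))
{
    int n = recv(ConnectSocket, (char *)input_buff, sizeof(input_buff), 0);
    // handle n > 0 (data), n == 0 (peer closed), n < 0 (check WSAGetLastError())
}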

Related

OS freeze while trying to send UDP packet from linux kernel

I'm modifying UDP to implement a custom protocol. After UDP connect() establishes a route, I want to send a custom UDP packet to the destination (like a SYN packet in TCP). When I try the connect() socket function on a machine running my custom kernel, it freezes without writing anything to the kernel log. Here's my code:
int quic_connect(struct sock *sk, struct flowi4 *fl4, struct rtable *rt)
{
    struct sk_buff *skb, *buff;
    struct inet_cork cork;
    struct ipcm_cookie ipc;
    struct sk_buff_head queue;
    char *hello;
    int err = 0, exthdrlen, hh_len, datalen, trailerlen;
    char *data;

    hh_len = LL_RESERVED_SPACE(rt->dst.dev);
    exthdrlen = rt->dst.header_len;
    trailerlen = rt->dst.trailer_len;
    datalen = 200;

    /* Create a buffer to be sent without fragmentation */
    skb = sock_alloc_send_skb(sk,
                              exthdrlen + datalen + hh_len + trailerlen + 15,
                              MSG_DONTWAIT, &err);
    if (skb == NULL)
        goto out;

    skb->ip_summed = CHECKSUM_PARTIAL; /* use hardware checksum */
    skb->csum = 0;
    skb_reserve(skb, hh_len);
    skb_shinfo(skb)->tx_flags = 1; /* timestamp the packet */

    /*
     * Find where to start putting bytes.
     */
    data = skb_put(skb, datalen + exthdrlen);
    skb_set_network_header(skb, exthdrlen);
    skb->transport_header = (skb->network_header + sizeof(struct iphdr));

    __skb_queue_head_init(&queue);

    /*
     * Put the packet on the pending queue.
     */
    __skb_queue_tail(&queue, skb);

    cork.flags = 0;
    cork.addr = 0;
    cork.opt = NULL;
    ipc.opt = NULL;
    ipc.tx_flags = 0;
    ipc.ttl = 0;
    ipc.tos = -1;
    ipc.addr = fl4->daddr;

    err = ip_setup_cork(sk, &cork, &ipc, &rt);
    buff = __ip_make_skb(sk, fl4, &queue, &cork);
    kfree(skb);
    err = PTR_ERR(buff);
    if (!IS_ERR_OR_NULL(buff))
        err = udp_send_skb(buff, fl4);
out:
    return err;
}
The function quic_connect is called at the end of the ip4_datagram_connect function which is the registered handler for UDP connect.
There is absolutely nothing in the kernel log.
What am I doing wrong here?
EDIT 1: The problem occurs at err = udp_send_skb(buff, fl4); as there is no issue when I comment out that line. So I'm assuming my sk_buff has not been formed correctly. Any ideas why?

SendMessage() to win32 application vc++

I have a Win32 application project, but when the program gets to a call like, let's say,
new_socket = accept(socket, (sockaddr *)&client, &c);
it gets stuck. While it is stuck I can't use any other button, the file menu, etc. Can anybody tell me what is wrong and how I can fix it?
This is the function where it gets stuck:
void server()
{
    WSADATA wsa;
    SOCKET server_socket, client_socket;
    struct sockaddr_in server, client;
    int c, yes = 1;
    int sent_length = 1;

    if (WSAStartup(MAKEWORD(2,2), &wsa) != 0)
        printf("Failed. Error Code : %d", WSAGetLastError());

    if ((server_socket = socket(AF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET) {
        printf("Could not create socket : %d", WSAGetLastError());
    }

    server.sin_family = AF_INET;
    server.sin_addr.s_addr = 0;
    server.sin_port = htons(8080);
    memset(&(server.sin_zero), '\0', 8);

    bind(server_socket, (struct sockaddr *)&server, sizeof(server));
    listen(server_socket, 3);

    c = sizeof(struct sockaddr_in);
    while (1) {
        client_socket = accept(server_socket, (struct sockaddr *)&client, &c);
        send(client_socket, "Hello, World", 13, 0);
    }
    WSACleanup();
}
accept() is a synchronous call and does not return until a client connects, so it must not be called on the GUI thread. Either make this call in another thread, or (better) use only asynchronous APIs (AcceptEx).
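For example, a minimal sketch of the threaded approach (assuming the server() function from the question; _beginthreadex is just one way to start the worker):

#include <process.h>   // _beginthreadex

// Hedged sketch: run the blocking accept loop on a worker thread so the
// GUI message loop stays responsive.
unsigned __stdcall ServerThread(void *)
{
    server();          // the blocking accept/send loop from the question
    return 0;
}

// Somewhere in WinMain / initialization, before entering the message loop:
HANDLE hServer = (HANDLE)_beginthreadex(NULL, 0, ServerThread, NULL, 0, NULL);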

how to work CORRECTLY with SSL_read() and select()?

I am trying to write a C++ TLS client with OpenSSL which uses a non-blocking socket on Windows.
I want to work with the SSL_read()/SSL_write() and select() functions, but I can't find an algorithm that works well, and the net doesn't provide good and simple examples. There is always a timeout returned by select() after the last block of data is received.
I don't understand the OpenSSL API: SSL_pending() already returns 0, yet select() times out??
select() causes a critical delay on the last block of data.
My algorithm for recv_buffer() is as follows.
I have a function which checks whether a socket is readable or writable (it works well):
int CSocket::socket_RWable(int rw_flag, const int time_out)
{
    fd_set rwfs;
    int error = 0;
    struct timeval timeout;
    try
    {
        memset(&timeout, 0, sizeof(struct timeval));
        timeout.tv_sec = time_out;
        while (1) // monitoring loop
        {
            FD_ZERO(&rwfs);
            FD_SET(m_socket, &rwfs);
            // watch the socket for readability or writability
            if (rw_flag == R_MODE)
                error = select(m_socket+1, &rwfs, NULL, NULL, &timeout);
            else if (rw_flag == W_MODE)
                error = select(m_socket+1, NULL, &rwfs, NULL, &timeout);
            if (error < 0) // select failed
                throw 1;
            else if (error == 0) // timeout expired
                throw 2;
            // an I/O operation on the socket is available
            if (FD_ISSET(m_socket, &rwfs) != 0)
            {
                FD_CLR(m_socket, &rwfs);
                return 0;
            }
        }
    }
    catch (int ret)
    {
        FD_CLR(m_socket, &rwfs);
        if (ret == 1) throw CErreur("[-] CSocket : select : ", CWinUtil::Win_sys_error(NET_ERROR));
        else if (ret == 2) return -1;
    }
    return -1;
}
UPDATE:
This function receives the data into a buffer and causes a timeout after the last block of data:
int CTLSClient::recv_buffer(char *buffer, const int buffer_size, const int time_out)
{
    int selectErr = 0;
    int sslErr = 0;
    int retRead = 0;
    int recvData = 0;

    selectErr = m_socket->socket_RWable(R_MODE, time_out);
    while (selectErr == 0)
    {
        retRead = SSL_read(m_ssl, buffer+recvData, buffer_size-recvData);
        sslErr = SSL_get_error(m_ssl, retRead);
        if (sslErr == SSL_ERROR_NONE)
        {
            cout << "DEBUG 2 SSL_ERROR_NONE recv data=" << retRead << endl;
            recvData += retRead;
        }
        else if (sslErr == SSL_ERROR_WANT_READ)
        {
            cout << "DEBUG 3 SSL_ERROR_WANT_READ select()" << endl;
            selectErr = m_socket->socket_RWable(R_MODE, time_out);
        }
        else if (sslErr == SSL_ERROR_WANT_WRITE)
        {
            cout << "DEBUG 4 SSL_ERROR_WANT_WRITE select()" << endl;
            selectErr = m_socket->socket_RWable(W_MODE, time_out);
        }
        else if (sslErr == SSL_ERROR_ZERO_RETURN)
        {
            return -1;
        }
        else
            return -1;
    }
    return recvData;
}
This is the output of a connection to a POP3 server:
DEBUG 2 SSL_ERROR_NONE recv data=35
DEBUG 3 SSL_ERROR_WANT_READ select()
[S]+OK BLU0-POP617 POP3 server ready
total data -> 35
DEBUG 2 SSL_ERROR_NONE recv data=23
DEBUG 3 SSL_ERROR_WANT_READ select()
[S]+OK password required
total data -> 23
DEBUG 2 SSL_ERROR_NONE recv data=30
DEBUG 3 SSL_ERROR_WANT_READ select()
[S]+OK mailbox has 180 messages
total data -> 30
DEBUG 2 SSL_ERROR_NONE recv data=18
DEBUG 3 SSL_ERROR_WANT_READ select()
[S]+OK 180 12374432
total data -> 18
DEBUG 2 SSL_ERROR_NONE recv data=13
DEBUG 3 SSL_ERROR_WANT_READ select()
[S]+OK 1 23899
total data -> 13
DEBUG 2 SSL_ERROR_NONE recv data=5
DEBUG 3 SSL_ERROR_WANT_READ select()
DEBUG 2 SSL_ERROR_NONE recv data=8192
DEBUG 2 SSL_ERROR_NONE recv data=8192
DEBUG 3 SSL_ERROR_WANT_READ select()
DEBUG 3 SSL_ERROR_WANT_READ select()
DEBUG 2 SSL_ERROR_NONE recv data=7521
DEBUG 3 SSL_ERROR_WANT_READ select()
[S]total data -> 23910
Assuming you have already read the headers: for some reason SSL_read() hangs after reading the email message and returns SSL_ERROR_WANT_READ. I solved this problem by looping through the message body one line at a time until I find the ending period. When I reach this line, I call SSL_pending(). Although there is no pending data, this prevents an endless loop where SSL_read() returns SSL_ERROR_WANT_READ. However, I am looking for a better solution.
for (;;)
{
    char *line = ReadLine(ssl, buf, sizeof(buf));
    if (line != NULL)
    {
        if (*line == '.')
        {
            int pending = SSL_pending(ssl);
            if (pending > 0)
            {
                int read = SSL_read(ssl, buf, pending);
            }
        }
    }
}
This function reads one character at a time until it reaches an end of line character and returns the line.
char *ReadLine(SSL *ssl, char *buf, int size)
{
    int i = 0;
    char *ptr = NULL;
    for (ptr = buf; size > 1; size--, ptr++)
    {
        i = SSL_read(ssl, ptr, 1);
        switch (SSL_get_error(ssl, i)) {
        case SSL_ERROR_NONE:
            break;
        case SSL_ERROR_ZERO_RETURN:
            break;
        case SSL_ERROR_WANT_READ:
            break;
        case SSL_ERROR_WANT_WRITE:
            break;
        default:
            TRACE("SSL problem\r\n");
        }
        if (*ptr == '\n')
            break;
        if (*ptr == '\r') {
            ptr--;
        }
    }
    *ptr = '\0';
    return buf;
}
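An alternative pattern, offered only as a rough sketch: after select() reports the socket readable, keep calling SSL_read() while SSL_pending() says OpenSSL still holds decrypted bytes, because select() only sees bytes on the socket, not data already buffered inside the SSL object. The names chunk and on_data below are illustrative, not from the original code:

// Hedged sketch: read once when select() says the socket is readable,
// then keep reading while OpenSSL still holds decrypted bytes internally.
char chunk[4096];
int n;
do {
    n = SSL_read(ssl, chunk, sizeof(chunk));
    if (n > 0) {
        on_data(chunk, n);              // hypothetical callback for received bytes
    } else {
        int err = SSL_get_error(ssl, n);
        if (err == SSL_ERROR_WANT_READ || err == SSL_ERROR_WANT_WRITE)
            break;                      // go back to select() and wait
        break;                          // SSL_ERROR_ZERO_RETURN or a real error
    }
} while (SSL_pending(ssl) > 0);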

recv() only reads 1 byte (implementing an FTP with winsock)

I'm trying to implement a simple FTP client using winsock. I'm having problems trying to download a file. Here's the code I'm using at the moment:
bool FTPHandler::downloadFile(const char * remoteFilePath, const char * filePath) {
    if (!isConnected()) {
        setErrorMsg("Not connected, imposible to upload file...");
        return false;
    }
    if (usePasiveMode) {
        this->pasivePort = makeConectionPasive();
        if (this->pasivePort == -1) {
            //error msg will be setted by makeConectionPasive()
            return false;
        }
    } else {
        setErrorMsg("Unable to upload file not in pasive mode :S");
        return false;
    }
    char * fileName = new char[500];
    getFileName(remoteFilePath, fileName);
    // Default name and path := current directory and same name as remote.
    if (filePath == NULL) {
        filePath = fileName;
    }
    if (!setDirectory(remoteFilePath)) {
        return false;
    }
    char msg[OTHER_BUF_SIZE];
    char serverMsg[SERVER_BUF_SIZE];
    sprintf(msg, "%s%s\n", RETR_MSG, fileName);
    send(sock, msg, strlen(msg), 0);
    SOCKET passSocket;
    SOCKADDR_IN passServer;
    passSocket = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (passSocket == INVALID_SOCKET) {
        WSACleanup();
        sprintf(errorMsg, "Error trying to create socket (WSA error code: %d)", WSAGetLastError());
        return false;
    }
    passServer.sin_family = PF_INET;
    passServer.sin_port = htons(this->pasivePort);
    passServer.sin_addr = *((struct in_addr *)gethostbyname(this->host)->h_addr);
    memset(server.sin_zero, 0, 8);
    int errorCode = connect(passSocket, (LPSOCKADDR)&passServer, sizeof(struct sockaddr));
    int tries = 0;
    while (errorCode == SOCKET_ERROR) {
        tries++;
        if (tries >= MAX_TRIES) {
            closesocket(passSocket);
            sprintf(errorMsg, "Error trying to create socket");
            WSACleanup();
            return false;
        }
    }
    char * buffer = (char *)malloc(CHUNK_SIZE);
    ofstream f(filePath);
    Sleep(WAIT_TIME);
    while (int readBytes = recv(passSocket, buffer, CHUNK_SIZE, 0) > 0) {   // <-- the recv() in question
        buffer[readBytes] = '\0';
        f.write(buffer, readBytes);
    }
    f.close();
    Sleep(WAIT_TIME);
    recv(sock, serverMsg, OTHER_BUF_SIZE, 0);
    if (!startWith(serverMsg, FILE_STATUS_OKEY_CODE)) {
        sprintf(errorMsg, "Bad response: %s", serverMsg);
        return false;
    }
    return true;
}
That last recv() returns 1 byte several times, then the method ends and the file, which should be around 1 KB, is just 23 bytes.
Why isn't recv reading the whole file?
There are all kinds of logic holes and incorrect/missing error handling in this code. You really need to clean up this code in general.
You are passing the wrong sizeof() value to connect(), and not handling an error correctly if connect() fails (your retry loop is useless). You need to use sizeof(sockaddr_in) or sizeof(passServer) instead of sizeof(sockaddr). You are also not initializing passServer correctly.
You are not checking recv() for errors. And in the off-chance that recv() actually reads CHUNK_SIZE bytes, you have a buffer overflow that will corrupt memory when you write the null byte into the buffer (which you do not need to do), because you are writing it past the boundary of the buffer.
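As a hedged sketch of what that read loop could look like (keeping the question's passSocket, buffer, CHUNK_SIZE, and ofstream f; note the assignment is parenthesized so readBytes holds the byte count rather than the result of the > comparison):

// Hedged sketch: read until the server closes the data connection,
// checking for errors and never writing past the end of the buffer.
int readBytes;
while ((readBytes = recv(passSocket, buffer, CHUNK_SIZE, 0)) > 0) {
    f.write(buffer, readBytes);          // write exactly what was received
}
if (readBytes == SOCKET_ERROR) {
    // recv failed; inspect WSAGetLastError() and abort the transfer
}
// readBytes == 0 means the server closed the data connection (end of transfer)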
If connect() fails, or recv() fails with any error other than a server-side initiated disconnect, you are not telling the server to abort the transfer.
Once you tell the server to go into Passive mode, you need to connect to the IP/Port (not just the Port) that the server tells you, before you then send your RETR command.
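For illustration, a rough sketch of parsing a typical "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)" reply; reply is assumed to hold the raw PASV response text, which the original code does not show:

// Hedged sketch: extract the data-connection IP and port from a PASV reply.
int h1, h2, h3, h4, p1, p2;
const char *paren = strchr(reply, '(');
if (paren && sscanf(paren, "(%d,%d,%d,%d,%d,%d)", &h1, &h2, &h3, &h4, &p1, &p2) == 6)
{
    char dataIp[32];
    sprintf(dataIp, "%d.%d.%d.%d", h1, h2, h3, h4);
    unsigned short dataPort = (unsigned short)(p1 * 256 + p2);

    SOCKADDR_IN dataAddr;
    memset(&dataAddr, 0, sizeof(dataAddr));
    dataAddr.sin_family = AF_INET;
    dataAddr.sin_addr.s_addr = inet_addr(dataIp);
    dataAddr.sin_port = htons(dataPort);
    // connect the data socket to dataAddr before sending RETR
}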
Don't forget to send the server a TYPE command so it knows what format to send the file bytes in, such as TYPE A for ASCII text and TYPE I for binary data. If you try to transfer a file in the wrong format, you can corrupt the data. FTP's default TYPE is ASCII, not Binary.
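For instance, a minimal sketch of requesting binary mode on the control connection before the RETR (reusing the question's sock and serverMsg; the reply handling is simplified):

// Hedged sketch: request binary ("image") mode on the control connection.
send(sock, "TYPE I\r\n", 8, 0);
recv(sock, serverMsg, OTHER_BUF_SIZE, 0);   // expect a "200 Type set to I" style reply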
And lastly, since you clearly do not seem to know how to program sockets effectively, I suggest you use the FTP portions of the WinInet library instead of WinSock directly, such as the FtpGetFile() function. Let WinInet handle the details of transferring FTP files for you.

Upper limit to UDP performance on Windows Server 2008

It looks like from my testing I am hitting a performance wall on my 10Gb network. I seem to be unable to read more than 180-200k packets per second. Looking at perfmon or Task Manager, I can receive up to a million packets per second, if not more. Testing 1 socket or 10 or 100 doesn't seem to change this limit of 200-300k packets a second. I've fiddled with RSS and the like without success. Unicast vs multicast doesn't seem to matter, and overlapped I/O vs synchronous doesn't make a difference either. The size of the packets doesn't matter either. There just seems to be a hard limit to the number of packets Windows can copy from the NIC to the buffer. This is a Dell R410. Any ideas?
#include "stdafx.h"
#include <WinSock2.h>
#include <ws2ipdef.h>
static inline void fillAddr(const char* const address, unsigned short port, sockaddr_in &addr)
{
memset( &addr, 0, sizeof( addr ) );
addr.sin_family = AF_INET;
addr.sin_addr.s_addr = inet_addr( address );
addr.sin_port = htons(port);
}
int _tmain(int argc, _TCHAR* argv[])
{
#ifdef _WIN32
WORD wVersionRequested;
WSADATA wsaData;
int err;
wVersionRequested = MAKEWORD( 1, 1 );
err = WSAStartup( wVersionRequested, &wsaData );
#endif
int error = 0;
const char* sInterfaceIP = "10.20.16.90";
int nInterfacePort = 0;
//Create socket
SOCKET m_socketID = socket( AF_INET, SOCK_DGRAM, IPPROTO_UDP );
//Re use address
struct sockaddr_in addr;
fillAddr( "10.20.16.90", 12400, addr ); //"233.43.202.1"
char one = 1;
//error = setsockopt(m_socketID, SOL_SOCKET, SO_REUSEADDR , &one, sizeof(one));
if( error != 0 )
{
fprintf( stderr, "%s: ERROR setsockopt returned %d.\n", __FUNCTION__, WSAGetLastError() );
}
//Bind
error = bind( m_socketID, reinterpret_cast<SOCKADDR*>( &addr ), sizeof( addr ) );
if( error == -1 )
{
fprintf(stderr, "%s: ERROR %d binding to %s:%d\n",
__FUNCTION__, WSAGetLastError(), sInterfaceIP, nInterfacePort);
}
//Join multicast group
struct ip_mreq mreq;
mreq.imr_multiaddr.s_addr = inet_addr("225.2.3.13");//( "233.43.202.1" );
mreq.imr_interface.s_addr = inet_addr("10.20.16.90");
//error = setsockopt( m_socketID, IPPROTO_IP, IP_ADD_MEMBERSHIP, reinterpret_cast<char*>( &mreq ), sizeof( mreq ) );
if (error == -1)
{
fprintf(stderr, "%s: ERROR %d trying to join group %s.\n", __FUNCTION__, WSAGetLastError(), "233.43.202.1" );
}
int bufSize = 0, len = sizeof(bufSize), nBufferSize = 10*1024*1024;//8192*1024;
//Resize the buffer
getsockopt(m_socketID, SOL_SOCKET, SO_RCVBUF, (char*)&bufSize, &len );
fprintf(stderr, "getsockopt size before %d\n", bufSize );
fprintf(stderr, "setting buffer size %d\n", nBufferSize );
error = setsockopt(m_socketID, SOL_SOCKET, SO_RCVBUF,
reinterpret_cast<const char*>( &nBufferSize ), sizeof( nBufferSize ) );
if( error != 0 )
{
fprintf(stderr, "%s: ERROR %d setting the receive buffer size to %d.\n",
__FUNCTION__, WSAGetLastError(), nBufferSize );
}
bufSize = 1234, len = sizeof(bufSize);
getsockopt(m_socketID, SOL_SOCKET, SO_RCVBUF, (char*)&bufSize, &len );
fprintf(stderr, "getsockopt size after %d\n", bufSize );
//Non-blocking
u_long op = 1;
ioctlsocket( m_socketID, FIONBIO, &op );
//Create IOCP
HANDLE iocp = CreateIoCompletionPort( INVALID_HANDLE_VALUE, NULL, NULL, 1 );
HANDLE iocp2 = CreateIoCompletionPort( (HANDLE)m_socketID, iocp, 5, 1 );
char buffer[2*1024]={0};
int r = 0;
OVERLAPPED overlapped;
memset(&overlapped, 0, sizeof(overlapped));
DWORD bytes = 0, flags = 0;
// WSABUF buffers[1];
//
// buffers[0].buf = buffer;
// buffers[0].len = sizeof(buffer);
//
// while( (r = WSARecv( m_socketID, buffers, 1, &bytes, &flags, &overlapped, NULL )) != -121 )
//sleep(100000);
while( (r = ReadFile( (HANDLE)m_socketID, buffer, sizeof(buffer), NULL, &overlapped )) != -121 )
{
bytes = 0;
ULONG_PTR key = 0;
LPOVERLAPPED pOverlapped;
if( GetQueuedCompletionStatus( iocp, &bytes, &key, &pOverlapped, INFINITE ) )
{
static unsigned __int64 total = 0, printed = 0;
total += bytes;
if( total - printed > (1024*1024) )
{
printf( "%I64dmb\r", printed/ (1024*1024) );
printed = total;
}
}
}
while( r = recv(m_socketID,buffer,sizeof(buffer),0) )
{
static unsigned int total = 0, printed = 0;
if( r > 0 )
{
total += r;
if( total - printed > (1024*1024) )
{
printf( "%dmb\r", printed/ (1024*1024) );
printed = total;
}
}
}
return 0;
}
I am using Iperf as the sender and comparing the amount of data received to the amount of data sent: iperf.exe -c 10.20.16.90 -u -P 10 -B 10.20.16.51 -b 1000000000 -p 12400 -l 1000
Edit: doing iperf to iperf, the performance is closer to 180k or so without dropping (8 MB client-side buffer). If I do TCP I can do about 200k packets/second. Here's what's interesting, though: I can do far more than 200k with multiple TCP connections, but multiple UDP connections do not increase the total (I test UDP performance with multiple iperfs, since a single iperf with multiple threads doesn't seem to work). All hardware acceleration is turned on in the drivers. It seems like UDP performance is simply subpar?
I've been doing some UDP testing with similar hardware as I investigate the performance gains that can be had from using the Winsock Registered I/O network extensions, RIO, in Windows 8 Server. For this I've been running tests on Windows Server 2008 R2 and on Windows Server 8.
I've yet to get to the point where I've begun testing with our 10Gb cards (they've only just arrived) but the results of my earlier tests and the example programs used to run them can be found here on my blog.
One thing that I might suggest is that with a simple test like the one you show, where there's very little work being done on each datagram, you may find that old-fashioned synchronous I/O is faster than the IOCP design, whilst the IOCP design pulls ahead as the workload per datagram rises and you can fully utilise the multiple threads.
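For comparison, a plain synchronous receive loop might look like the sketch below (reusing the question's m_socketID but assuming the socket is left in blocking mode rather than set non-blocking with FIONBIO; this is only a baseline, not a recommendation):

// Hedged sketch: blocking receive loop, one datagram per call.
// Useful as a baseline against the IOCP version when per-datagram work is tiny.
char datagram[2 * 1024];
sockaddr_in from;
int fromLen = sizeof(from);
unsigned __int64 total = 0;

for (;;)
{
    int n = recvfrom(m_socketID, datagram, sizeof(datagram), 0,
                     (sockaddr*)&from, &fromLen);
    if (n == SOCKET_ERROR)
        break;              // inspect WSAGetLastError() in real code
    total += n;             // count bytes; do minimal per-datagram work here
}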
Also, are your test machines wired back to back (i.e. without a switch) or do they run through a switch? If so, could the issue be down to the performance of your switch rather than your test machines? If you're using a switch, or have multiple NICs in the server, can you run multiple clients against the server? Could the issue be on the client rather than the server?
What CPU usage are you seeing on the sending and receiving machines? Have you looked at the machines' CPU usage with Process Explorer? This is more accurate than Task Manager. Which CPU is handling the NIC interrupts? Can you improve things by binding them to another CPU, or by changing the affinity of your test program to run on another CPU (as sketched below)? Is your IOCP example spreading its threads across multiple NUMA nodes, or are you locking all of them to one node?
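As a hedged illustration of the affinity experiment (pinning the receive thread to one core so it does not compete with the core servicing NIC interrupts; the mask value is only an example):

// Hedged sketch: pin the current (receive) thread to CPU 2, leaving CPU 0
// free for NIC interrupt handling. Adjust the mask for your topology.
DWORD_PTR mask = 1 << 2;                       // one bit per logical CPU
DWORD_PTR previous = SetThreadAffinityMask(GetCurrentThread(), mask);
if (previous == 0)
{
    // call failed; check GetLastError()
}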
I'm hoping to get to run some more tests next week and will update my answer when I have done so.
Edit: For me the problem was due to the fact that the NIC drivers had "flow control" enabled and this caused the sender to run at the speed of the receiver. This had some undesirable "non-paged pool" usage characteristics and turning off flow control allows you to see how fast the sender can go (and the difference in network utilisation between the sender and receiver clearly shows how much data is being lost). See my blog posting here for more details.
