TCP socket 'deleted' every 6 minutes on Windows Server 2012 R2 - windows

We have to migrate a System with our software from a Windows Server 2003 to a Windows Server 2012 R2. At this project we just changed the server hardware (to a HP ProLiant Server), the OS and the ISDN card with the CAPI driver. On this server there is a C++ application which filters a 30 byte character string out of the ISDN D-Channel and send it over a TCP socket (localhost, port 30000) to a JAVA application. The message comes every 30 seconds and has always the same format.
The problem is: Every 6 minutes the TCP socket is getting deleted/cleared/doesn't work. Both applications log the broken communication in their log files, build and open the socket again and the game goes on without any problems for another 6 minutes.
At the old system, this software works for years without any problems on Windows Server 2003 on 9 sites.
What we've already done without any positive effect:
deactivate the firewall completely
change the port to different ones (30001, 30500, 16000, 997)
use the own IP (10.16.58.30) instead on localhost
put several timeouts to the TCP Parameters at the registry (e.g. KeepAliveTime)
update JAVA to the last version
install all recommended updates for Windows Server 2012 R2
strip both applications down to just the socket to ensure that the software itself hasn’t any problem
using a standard message ('1234567891234567890') instead of the incoming ISDN message to exclude malfunction from strange input data
checking the message length to exclude a length of 0
checking all buffers on both sides to exclude buffer overflow
checking if any other software on the server is using 'our' port
The problem doesn't appear, if we're sending the messages from outside manual or in different message cycles with a bash script to the server port of the JAVA app.
We are now thinking that this can only be some kind of checking mechanism of the operating system that forces our socket to stop communication. Any suggestions, what that could be?

Related

Server in Windows doesn't respond to Websockets connection - NBSTAT <00>

I have a (Spring Boot) server, and a client trying to establish a Websockets connection to the server.
I've run and tested the server on a Linux machine (Ubuntu 20.04), it works fine.
It also ran fine on my Windows (Windows 10 Home) machine up until a few days ago. Now it is acting strange.
I checked the network traffic between client and server in wireshark, both in Linux and Windows.
Here is the linux capture:
And this is the windows capture:
The blacked out IPs are the client's. Both the linux and windows servers are running in the same network, so the problem would not be in a router configuration.
In both cases, the client makes the same request to /location/websocket, but in Linux the server responds successfully in less than 1 second, while in Windows it responds about 13 seconds later, and immediately follows the response by closing the websocket connection.
What looks strange to me are the NBSTAT name queries. I tried several times and there are always three queries between the arrival of the client request, and the closing of the websocket connection.
So maybe the windows machine needs to do a name query to respond successfully? Is this normal? What does the <00>...<00> string in the name query mean? I checked the network traffic while keeping the server up but not connecting to the client, and I didn't see any activity on port 137, so it definitely only happens when the client tries to contact the windows machine. What can I do about this? How can I get the server responding to websocket connections again?

gRPC stopped working with APO on Windows 11

I have a Windows application (APP) and Audio Processing Object (APO) loaded by AudioDG.exe that communicate via gRPC:
APP part that is written in C# creates server via Grpc.Core.
APO part creates client via grpc++.
Server is on 127.0.0.1:20000 (I can see it's up and listening with netstat -ano).
I can confirm that APO is loaded into audio device graph by inspecting it with process explorer.
Everything worked like a charm on Windows 8 and 10, but on 11 it cannot communicate at all - I get either Error Code 14, Unavailable, failed to connect to all addresses or 4, Deadline Exceeded.
After enabling debug traces, I now see "socket is null" description for "connect failed" error:
I0207 16:20:59.916447 0 ..\..\..\src\core\ext\filters\client_channel\subchannel.cc:950: subchannel 000001D8B9B01E20 {address=ipv4:127.0.0.1:10000, args=grpc.client_channel_factory=0x1d8bb660460, grpc.default_authority=127.0.0.1:10000, grpc.internal.subchannel_pool=0x1d8b8c291b0, grpc.primary_user_agent=grpc-csharp/2.43.0 (.NET Framework 4.8.4470.0; CLR 4.0.30319.42000; net45; x64), grpc.resource_quota=0x1d8b8c28d90, grpc.server_uri=dns:///127.0.0.1:10000}: connect failed: {"created":"#1644240059.916000000","description":"socket is null","file":"..\..\..\src\core\lib\iomgr\tcp_client_windows.cc","file_line":112}
What I've tried so far:
Updating both parts to the latest grpc versions.
Using "no proxy", "Http2UnencryptedSupport" and other env variables.
Using "localhost" or "0.0.0.0" instead of "127.0.0.1".
Updating connection to use self signed SSL certificates (root CA, server cert + key, client cert + key).
Adding inbound / outbound rules for my port, and then disabling firewall completely.
Creating server on APO side and trying to connect with the client in APP.
Everything works (both insecure and SSL creds) if I create both client and server in C# part, but as soon as it's APP-APO communication it feels blocked or sandboxed.
What has been changed in Windows 11 that can "block" gRPC?
Thanks in advance!
In your input you write:
Server is at 127.0.0.1:20000
Further looking at the logs, you can see that:
The server is located at
grpc.server_uri=dns:///127.0.0.1:10000
Based on the question posed and the amount of data provided, I would check which port the server is really using and which port the client is looking for a connection on.
The easiest way to do this is to use the built-in Resource Monitor application. On the Network tab, in the TCP Connections list, you can find the application and the port it uses.
You can also use the PowerShell command
Test-NetConnection -Port 10000 -InformationLevel "Detailed"
Test-NetConnection -Port 20000 -InformationLevel "Detailed"
At least this is the first thing I would check based on what you described.
Regarding your question about the changes in Windows 11, I do not think that this is something that's causing problems for you. However, Windows 11 has additional security features compared to Windows 10, try disabling the security features completely as a test. Perhaps this will help solve the problem.
As for ASP.NET Core 6.0 itself (if I understood the version correctly), then there is a possibility that the server part, working not in the sandbox of the programming environment, still does not accept the client certificate. At the program level, you can try to fix this by adding the following exception to the code:
// This switch must be set before creating the GrpcChannel/HttpClient.
AppContext.SetSwitch(
"System.Net.Http.SocketsHttpHandler.Http2UnencryptedSupport", true);
// The port number(5000) must match the port of the gRPC server.
var channel = GrpcChannel.ForAddress("http://localhost:5000");
var client = new Greet.GreeterClient(channel);
More troubleshooting issues with ASP.NET Core 6.0 Microsoft described in detail here.
https://learn.microsoft.com/en-us/aspnet/core/grpc/troubleshoot?view=aspnetcore-6.0
I hope it was useful and at least one of the solutions I suggested will help solve your problem. In any case, if I had more information, I think I could help you more accurately.

Websocket server stops accepting after ~600 connections

I'm running a websocket server (command line program) off port 9000 on a Windows 2008 server. I can't seem to figure out why it will not accept more than about 600 concurrent connections. Testing on my local machine, I can create thousands of concurrent connections. But on the server, I get the following error after about 600:
No connection could be made because the target machine actively refused it
I have tried adjusting registry entries for the max port number, and turning off the firewall to no avail. I have also tried a different websocket server implementation. Is there some other setting I need to change?
edit: I tried this on a Linux server as well with the same problem.
I found the problem:
It seems to be my client side internet connection. By running the same tests on a different network from the client side, I can create thousands of connections.

Subversion unbearably slow on Windows 7

My company is currently using TortoiseSVN 1.6.16 32-bit on Windows XP to connect via HTTPS to a VisualSVN-Server 2.1.19 running on a Windows Server 2003 residing in the same network (no proxy). We use a self-signed certificate and Kerberos authentication using windows credentials (I suppose this is a VisualSVN-specific feature). In this setup, everything works dandy.
When my company decided to move on to Windows 7, we tried TortoiseSVN 1.7.6 64-bit on Windows 7 64-bit which resulted in the following problem:
Any operation involving the server (repo-browser, checkout, update, checkin, ...) is unbearably slow e.g.
opening the repo-browser (10 projects): 15 min
update on a fresh checkout of 50 files: 1 min
checkin of a single empty file: 30 sec
Tortoise shows alternatively normal transmission speeds and 0 byte/s. Many small files seem to be slower than a few big ones.
The slow connection results in various failures when using neon as http-lib (serf is still slow, but operation finishes successfully without errors)
EasySVN, SmartSVN and the SVN command line client that comes with TortoiseSVN show the same behaviour. Same with TortoiseSVN 1.6.16 64-bit.
Changing the server protocol to HTTP (no SSL) does not improve the situation
On the other hand
TortoiseSVN 1.7.6 32-bit on Windows XP works fine with our server
Access via browser/WebDAV works well even under Windows 7
Server side logs do not show errors or even warnings
I found several posts which also complained about slow behaviour on Windows 7, but they didn't fit my bill because they were local operations or were restricted to TortoiseSVN.
As there is no indication that there is a general problem with Subversion on Windows 7, I suspect that it could be our OS' networking parameters or protocol versions. Are there any parameters which are known to influence Subversion's performance?
I have to admit I am not familiar with how exactly Subversion (or rather neon/serf) relies on the OS and on which parts. Any information on that would be greatly appreciated.
Are there any parameters in the subversion 'servers' file which I should test? How would you consider my chances that Wireshark'ing the connection will help me?
Similar experiences, opinions, hints, help and straws are welcome.
Wireshark shows sporadic gaps of ca. 5 sec in the TCP stream apparently caused by VisualSVN Server.
https: the server acknowledges the client hello then waits for 5 secs before sending its server hello
https: the server acknowledges the client key and than takes 5 secs before supplying its encrypted handshake data
https: even outside the handshake, server sometimes sends an ACK (on TCP level) and then waits for 5 sec before sending something back to the client (the data is encrypted so it's hard to tell whether the break occurs at some point of interest)
http: at both server side transmissions during the NTLM authentication
http: before server sending a FIN flag
A typical fail with Windows 7 against an older server is IPv6 networking.
If your machine does not have an SVN server listening on an IPv6 address Windows 7 might still try to do a TCP6 connect first (you can see it in Process Explorer if you look at the open sockets of the TortoiseSVN process while trying an operation), this has a timeout of a few seconds and then retries with IPv4.
Simple solutions are either upgrade your server to an IPv6 capable one or disable IPv6 for the Windows 7 clients.
Another thing you could verify (the answer above didn't work for us) is the Internet Explorer settings especially if you have IE9. We found that by disabling the option Automatically detect settings in the Internet Options -> Connection tab -> LAN settings, SVN started working normally again.
The issue was never properly cleared up. Most probably, the company internal network path between the client and the server was somehow at fault. The matter became obsolete when we moved the SVN server to another machine. The very same setup of server and clients works fine now, even with Windows 7.
I had the same symptom of a very slow repository browse, slow updates, slow everything.
My SVN server has two Ethernet cards, so it has two Ethernet IP addresses. The SVN server was only listening on one of the IP addresses. So a name resolution via WINS or NetBIOS could resolve to the 'wrong' IP address.
TortoiseSVN would retry, eventually the name resolution would find the 'correct' IP address, and things would work.

TCP: Address already in use exception - possible causes for client port? NO PORT EXHAUSTION

stupid problem. I get those from a client connecting to a server. Sadly, the setup is complicated making debugging complex - and we run out of options.
The environment:
*Client/Server system, both running on the same machine. The client is actually a service doing some database manipulation at specific times.
* The cnonection comes from C# going through OleDb to an EasySoft JDBC driver to a custom written JDBC server that then hosts logic in C++. Yeah, compelx - but the third party supplier decided to expose the extension mechanisms for their server through a JDBC interface. Not a lot can be done here ;)
The Symptom:
At (ir)regular intervals we get a "Address already in use: connect" told from the JDBC driver. They seem to come from one particular service we run.
Now, I did read all the stuff about port exhaustion. This is why we have a little tool running now that counts ports and their states every minute. Last time this happened, we had an astonishing 370 ports in use, with the count rising to about 900 AFTER the error. We aleady patched the registry (it is a windows machine) to allow more than the 5000 client ports standard, but even then, we are far far from that limit to start with.
Which is why I am asking here. Ayneone an ide what ELSE could cause this?
It is a Windows 2003 Server machine, 64 bit. The only other thing I can see that may cause it (but this functionality is supposedly disabled) is Symantec Endpoint Protection that is installed on the server - and being capable of actinc as a firewall, it could possibly intercept network traffic. I dont want to open a can of worms by pointing to Symantec prematurely (if pointing to Symantec can ever be seen as such). So, anyone an idea what else may be the cause?
Thanks
"Address already in use", aka WSAEADDRINUSE (10048), means that when the client socket prepared to connect to the server socket, it first tried to bind itself to a specific local IP/Port pair that was already in use by another socket, either an active one or one that has been closed but is still in the FD_WAIT state. This has nothing to do with the number of ports that are available.
I'm having the same issue on a Windows 2000 Server with a .Net application connecting to a SQL Server 7.0. There's like 10 servers with the same configuration and only one is showing this error several times a day. With a small test program I'm able to reproduce the error by just establishing a TCP connection on the SQL Server listening port. Running CurrPorts (http://www.nirsoft.net/utils/cports.html) shows there's still plenty of available ports in range 1024-5000.
I'm out of ideas and would like to know if you've found a solution since you've posted your question.
Edit : I finally found the solution : a worm was present on the server (WORM_DOWNAD.A) and exhausted local ports without being noticed.

Resources