I'm running a websocket server (command line program) off port 9000 on a Windows 2008 server. I can't seem to figure out why it will not accept more than about 600 concurrent connections. Testing on my local machine, I can create thousands of concurrent connections. But on the server, I get the following error after about 600:
No connection could be made because the target machine actively refused it
I have tried adjusting registry entries for the max port number, and turning off the firewall to no avail. I have also tried a different websocket server implementation. Is there some other setting I need to change?
edit: I tried this on a Linux server as well with the same problem.
I found the problem:
It seems to be my client side internet connection. By running the same tests on a different network from the client side, I can create thousands of connections.
Related
I have a (Spring Boot) server, and a client trying to establish a Websockets connection to the server.
I've run and tested the server on a Linux machine (Ubuntu 20.04), it works fine.
It also ran fine on my Windows (Windows 10 Home) machine up until a few days ago. Now it is acting strange.
I checked the network traffic between client and server in wireshark, both in Linux and Windows.
Here is the linux capture:
And this is the windows capture:
The blacked out IPs are the client's. Both the linux and windows servers are running in the same network, so the problem would not be in a router configuration.
In both cases, the client makes the same request to /location/websocket, but in Linux the server responds successfully in less than 1 second, while in Windows it responds about 13 seconds later, and immediately follows the response by closing the websocket connection.
What looks strange to me are the NBSTAT name queries. I tried several times and there are always three queries between the arrival of the client request, and the closing of the websocket connection.
So maybe the windows machine needs to do a name query to respond successfully? Is this normal? What does the <00>...<00> string in the name query mean? I checked the network traffic while keeping the server up but not connecting to the client, and I didn't see any activity on port 137, so it definitely only happens when the client tries to contact the windows machine. What can I do about this? How can I get the server responding to websocket connections again?
Running Postgresql 9.5 on a windows server 2012 R2 in Azure
While running some loadtests on my application, I get errors on not being able to connect to the postgres server. In the logs of postgres I get the following message:
could not receive data from client: No connection could be made
because the target machine actively refused it.
This only happens when the loadtest goes to the next scenario, hitting a different part of the code. So new connections to the database are required. But after 10-20 seconds the rest of the scenario works flawlessly without hitting any other hiccups. So the problem seems to be the tcp connections. (My code retries a couple of times but it is not feasible to let it retry for 20 seconds)
I'm using the following settings in the config files
postgresql.conf
listen_addresses = '*'
max_connections = 500
shared_buffers = 1024MB
temp_buffers = 2MB
work_mem = 2MB
maintenance_work_mem = 128MB
pg_hba.conf
host all all 0.0.0.0/0 trust
host all all ::/0 trust
I know, I know.. It is not save to accept connections from everyone, but this is just for testing purposes and to make sure these settings are not blocking any connection. So this answer is void
I've been monitoring the number of connection on the server and under the load it is stable at 75. Postgres is using around 350mb of RAM. So given the config and the vm specs (7gb ram) there should be plenty of space to create more connections. However when the next scenario is spinning up the number of connections does not increase, it stays level and starts giving these log messages about no connection could be made.
What could be the problem here?
It does sound like this isn't really a Postgres problem (hence no changes in DB stats you're checking), rather that the traffic is being stopped by the server. Possibly because traffic on that port is saturated while handling your load testing queries?
It doesn't sound like you're hitting any of the Azure resource limits (including the database limits if that applies to your setup?), but without more detail on your load tests it's hard to say exactly what is needed.
Solutions from around the web and other SO answers suggest:
Disable TCP autotuning and tweak the TCP/IP registry keys on the server, e.g. set TcpAckFrequency - see this article for details
Make TCP setting adjustments (like WinsockListenBacklog) - which may be affected by whether connection pooling is in use or not - see this MS support article, which is for SQL Server 2005 but has some great tips on troubleshooting rejected TCP/IP connections (using Network Monitor, but applies to newer tools)
Faster request processing if you have enough control of the server - source
Disabling network proxying (in your load testing app): <defaultProxy> <proxy usesystemdefault="False"/> </defaultProxy> - source
Most possible reason is a Firewall/Anti-virus:
Software/Personal Firewall Settings
Multiple Software/Personal Firewalls
Anti-virus Software
LSP Layer
(Virtual) Router Firmware
Does your current Azure infrastructure contain Firewall or Anti-virus ?
Additionally on doing some additional searches, it looks like this is a standard Windows "connection refused" message, which suggests that PostgreSQL is trying to connect to something and being refused.
Also possible that one network element in your network - assuming that you are still connected to the server - will delay or drop somes DB login/authentication network packets (considered for example as a fake auth.replay) ...
You may also use a packet analyzer (like Wireshark) to record/inspect network flow when the error appear.
Regards
I was facing the same issue in my AspNet core application while I was trying to connect the Postgresql from my application. The error was thrown in the Program.cs file when I was calling the Migrate function.
public static void Main(string[] args) {
try {
var host = BuildWebHost(args);
using(var scope = host.Services.CreateScope()) {
// Migrate once after app is started.
scope.ServiceProvider.GetService <MyDatabaseContext>().Migrate();
}
host.Run();
}
catch(Exception e) {
//NLog: catch setup errors
_logger ? .Error(e, "Stopped program because of exception: ");
throw;
}
}
To fix this problem I did the following steps.
Check whether the Postgresql service is running by going to the services.msc
Tried to login to the pgAdmin with the user and password I provided in the database context
Everything was file, and as you know that 5432 is the default port of Postgresql and somehow I was using a different port in my application connection string, changing it to 5432 fixed this issue for me.
"ConnectionString": "User Id=postgres;Password=mypwd;Host=localhost;Port=5432;Database=mydb;"
I came across a similar issue whilst trying to beast my api, where I was seeing Npgsql.NpgsqlException No connection could be made because the target machine actively refused it..
However my issue was was down to the fact that I was re-creating my NpgsqlConnection for each query rather than re-using and keeping it alive.
I am trying to run ftp client and server.
The connection successful only if both server and client windows firewall is turned off.
I tried to turn the firewall on and allow roll for:
1. (inbound and outbound) tcp port 20-21 allow
2. allow end point communication for server-client
However, my problem still not solved.
Anyone has any other ideas?
Thanks ahead
Fixed it. The server throw an exception. I found it when i created log file on both server and client side
I am trying to find somw Windows based tools that can help me validate TCP and UDP connection on remote machines.
My Problem (just one use case):
At work, I manage many clustered servers that I run load tests against. In order to get a rich test, I use Jmeter-Plugins which provides a Server agent that opens a TCP socket on port 4444 on a target remote machine: http://code.google.com/p/jmeter-plugins/wiki/PerfMonAgent
There are many times when I setup a new load test farm, that either the network, or the server configuration, or the ServerAgent itself can have issues and thus not allowing a Load test client to access that TCP connection.
The issue I have is that I dont know what part of the system is broken.
What I think I need:
I would like to know how I can open a TCP (not HTTP with cUrl), connection to a remote server to validate that the network allows the connection, as well as the Server firewall allows the given TCP connection to be accessed remotely.
What I have looked:
These are some of the tools I have looked at so far:
Nmap http://nmap.org
Ncat http://sourceforge.net/projects/nmap-ncat/
TCP/IP Builder http://www.drk.com.ar
Zenmap 6.01 and nmap might do the job I want, but some machines where not accessible to Zenmap when I know 100% that the server was accessible via HTTP, so that was strange.
I have looked at many tools and either they:
Dont allow remote connections
Dont seem to want to connect to a TCP socket
Or I dont understand the tools to accomplish the validation I stated above.
I would greatly appreciate all comment and suggestions to help with this re-occurring problem I face.
Mick,
Firebind.com can do what you'd like to do. Firebind is an Internet based server that can listen on any of the 65535 UDP or TCP ports. It uses a java based client to send traffic to and from the server from your machine.
Carl
www.firebind.com
I'm using beanstalkd to offload some work to other machines. The setup is a bit unusual, the server is on the internet (public ip) but the consumers are behind adsl lines on some peoples homes. So there is a linux server as client going out through a dynamic ip and connecting to the server to get a job. It's all PHP and I'm using pheanstalk library.
Everything runs smoothly for some time, but then the adsl changes the IP (every 24h hours the provider forces a disconnect-reconnect) the client just hangs, never to go out of "reserve".
I thought that putting a timeout on the reserve would help it, but it didn't. As it seems, the client issues a command and blocks, it never checks the timeout. It just issues a reserve-with-timeout (instead of a simple reserve) and it is the servers responsibility to return a TIME_OUT as the timeout occurs. The problem is, the connection is broken (but the TCP/IP doesn't know about that yet until any of the sides try to talk to the other side) and if the client blocked reading, it will never return.
The library seems to have support for some kind of timeouts locally (for example when trying to connect to server), but it does not seem to contemplate this scenario.
How could I detect the stale connection and force a reconnect? Is there some kind of keepalive on the protocol (and on the pheanstalk itself)?
Thanks!
You could try to close each connection right after the request is answered and reopen a new connection each time.
There is no close() function but you deleting the Pheanstaly Object with unset($pheanstalk) will close it.
This explanation is quite helpful:
Pheanstalk (PHP client for beanstalk) - how do connections work?
I haven't tried it yet, but I came up with the idea of connecting to the beanstalk server through an SSH tunnel. We can enable the ServerAliveCountMax and ServerAliveInterval options on the tunnel, so that a network or server failure will cause the tunnel to close. This should then cause the pheanstalk client to report an error.