ora-12528: TNS:Listener: All Appropriate instances are blocking new connections - oracle

I am getting this error when I try to connect to my database:
ora-12528: TNS:Listener: All Appropriate instances are blocking new connections
I tried the following, with no success:
Stop and Start the Listener.
Shutdown and Startup database.
Restart the oracle services.
How might I resolve this?

You might have a problem with either the network and/or the archive logs - the above usually happens when the area/disk where the archive logs are stored is full, Oracle then just refuses new connections.
Another possibility is that you maxed out the number of allowed connections - this should usually be warning sign that you might have an application which leaks connections.
If you are 100% sure that you are not leaking connections then you could configure Oracle to accept more connections (BEWARE of licensing, RAM etc.!).

Related

TCPListener in golang: error when number of connections is above 60 / 260

I am building TCP Proxy: client <-> proxy <-> Vertica
I have a net.TCPListener, which takes incoming requests by AcceptTCP() and creating connections, then, making connection to destination socket by net.DialTCP("tcp", nil, raddr). Looks like a bridge. Default proxy model.
Firstly, at first version, i have a trouble: if i have 59 parallel incoming request, everything is fine. But if i have one more (60), i have a trouble: 1-59 connections are OK, but 60 and newer are fault. I cant catch error properly. Looks like some socket unexpectedly closes
Secondly, i tried to set queue for listener. It helps me a lot: but if i have more than 258 requests, i get error again.
My question: is there any limit of connections in net package? May be it is system limitation?
For external info: Vertica running in docker container, hw/system: macbook, vertica limit connection pool: 5, but pool logic implemented into proxy.
I also tried set "raw" proxy without pool logic (thats why i set queue for listener: i must not exceed threshold of Vertica User's pool), result is 258 requests..
UPDATED: (05.04.2020)
Looks like it is system limitations fault. Did I mention anywhere that I trying to run the whole system on one PC?
So, what I had:
300 parallel processes as requests (making by multiprocessing.Pool
Python) (300 sockets)
Listener that creates 300 connections (once
more 300 sockets)
And series of rapidly creating/closing sockets in
deep of proxy (according to queue and Vertica pool)
What I have now:
300 python requests making from another PC in my local network (on Windows)
Proxy works fine
But I have several errors on Windows PC, which creating requests to my proxy. Errors like low memory in "swap file".
I still need to make some stress test for proxy. Adding less memory for swap file didn't solve my problem on Windows PC. I will be grateful for any suggestions and ideas. Thanks!
How does the proxy connect to Vertica?
There is by default a maximum of 50 ordinary mortal users to be connected to one Vertica node at any one time. The superuser "dbadmin" always has 5 connections in addition to that.
So if I try to connect 60 times as dbadmin, I get this on a default Vertica configuration:
Connection attempt failed: FATAL 4060: New session rejected due to limit, already 55 sessions active
You can increase the Vertica config item MaxClientSessions from its default of 50 per node.
Command is : ALTER NODE <_node_name_> SET MaxClientSessions = 100, for example.
I suppose you are always connecting to the same Vertica node, and that you have set ConnectionLoadBalancing to FALSE. So you always connect to the same node, and soon reach the default maximum of 50.
Hope that's the reason found ....

Postgresql: No connection could be made because the target machine actively refused it

Running Postgresql 9.5 on a windows server 2012 R2 in Azure
While running some loadtests on my application, I get errors on not being able to connect to the postgres server. In the logs of postgres I get the following message:
could not receive data from client: No connection could be made
because the target machine actively refused it.
This only happens when the loadtest goes to the next scenario, hitting a different part of the code. So new connections to the database are required. But after 10-20 seconds the rest of the scenario works flawlessly without hitting any other hiccups. So the problem seems to be the tcp connections. (My code retries a couple of times but it is not feasible to let it retry for 20 seconds)
I'm using the following settings in the config files
postgresql.conf
listen_addresses = '*'
max_connections = 500
shared_buffers = 1024MB
temp_buffers = 2MB
work_mem = 2MB
maintenance_work_mem = 128MB
pg_hba.conf
host all all 0.0.0.0/0 trust
host all all ::/0 trust
I know, I know.. It is not save to accept connections from everyone, but this is just for testing purposes and to make sure these settings are not blocking any connection. So this answer is void
I've been monitoring the number of connection on the server and under the load it is stable at 75. Postgres is using around 350mb of RAM. So given the config and the vm specs (7gb ram) there should be plenty of space to create more connections. However when the next scenario is spinning up the number of connections does not increase, it stays level and starts giving these log messages about no connection could be made.
What could be the problem here?
It does sound like this isn't really a Postgres problem (hence no changes in DB stats you're checking), rather that the traffic is being stopped by the server. Possibly because traffic on that port is saturated while handling your load testing queries?
It doesn't sound like you're hitting any of the Azure resource limits (including the database limits if that applies to your setup?), but without more detail on your load tests it's hard to say exactly what is needed.
Solutions from around the web and other SO answers suggest:
Disable TCP autotuning and tweak the TCP/IP registry keys on the server, e.g. set TcpAckFrequency - see this article for details
Make TCP setting adjustments (like WinsockListenBacklog) - which may be affected by whether connection pooling is in use or not - see this MS support article, which is for SQL Server 2005 but has some great tips on troubleshooting rejected TCP/IP connections (using Network Monitor, but applies to newer tools)
Faster request processing if you have enough control of the server - source
Disabling network proxying (in your load testing app): <defaultProxy> <proxy usesystemdefault="False"/> </defaultProxy> - source
Most possible reason is a Firewall/Anti-virus:
Software/Personal Firewall Settings
Multiple Software/Personal Firewalls
Anti-virus Software
LSP Layer
(Virtual) Router Firmware
Does your current Azure infrastructure contain Firewall or Anti-virus ?
Additionally on doing some additional searches, it looks like this is a standard Windows "connection refused" message, which suggests that PostgreSQL is trying to connect to something and being refused.
Also possible that one network element in your network - assuming that you are still connected to the server - will delay or drop somes DB login/authentication network packets (considered for example as a fake auth.replay) ...
You may also use a packet analyzer (like Wireshark) to record/inspect network flow when the error appear.
Regards
I was facing the same issue in my AspNet core application while I was trying to connect the Postgresql from my application. The error was thrown in the Program.cs file when I was calling the Migrate function.
public static void Main(string[] args) {
try {
var host = BuildWebHost(args);
using(var scope = host.Services.CreateScope()) {
// Migrate once after app is started.
scope.ServiceProvider.GetService <MyDatabaseContext>().Migrate();
}
host.Run();
}
catch(Exception e) {
//NLog: catch setup errors
_logger ? .Error(e, "Stopped program because of exception: ");
throw;
}
}
To fix this problem I did the following steps.
Check whether the Postgresql service is running by going to the services.msc
Tried to login to the pgAdmin with the user and password I provided in the database context
Everything was file, and as you know that 5432 is the default port of Postgresql and somehow I was using a different port in my application connection string, changing it to 5432 fixed this issue for me.
"ConnectionString": "User Id=postgres;Password=mypwd;Host=localhost;Port=5432;Database=mydb;"
I came across a similar issue whilst trying to beast my api, where I was seeing Npgsql.NpgsqlException No connection could be made because the target machine actively refused it..
However my issue was was down to the fact that I was re-creating my NpgsqlConnection for each query rather than re-using and keeping it alive.

Connection pool opens more connections then maximum pool size

Hey I'm using Glassfish open source v4 and I'm having a weird problem.
I have defined a JDBC connection pool to Oracle 11g in the admin console and I've set :
Pool Settings
Initial and Minimum Pool Size: 500
Maximum Pool Size: 1000
Pool Resize Quantity: : 750
And I've created a specific user for this connection pool. Yet sometimes when I inspect opened connections in the database I see that there are more then 1000 (maximum I've seen was 1440)
When this happens any query attempts fail, sometimes with OutOfMemory exception, some show http thread interuptions and some don't show any logs at all, just takes a long time.
What I am wondering is how is it possible the Glassfish opens more connections then I've defined it to?
1t try to compare output from netstat on appl. server and db server side. You may have some "dangling" connections. Also try to find some documentation about DCD (Dead connection detection) in Oracle.
Few years ago I saw situations where Java application server thought that the connection is dead because it is not responding for few minutes. So this connection was put onto some dead connection list and a new connection was created.
There also can be some network issues - for example there is a FW between appl and db server.
When TCP connection is not active for one hour then it's cut over on one side but DB sever does not know about that.
The usual way how to investigate that is
compare output of both netstat(s) (appl./db)
identify dangling TCP connections
translate TCP connection onto Unix process id(PID) of Oracle session process
translate PID onto Oracle session (SID and SERIAL#)
kill the session on Oracle level (alter system kill session ...)

How to validate Oracle's ValidateConnection Property is working?

Someone told me if I set the "ValidateConnection" property in Oracle to TRUE, the application will be able to handle the following cases:
Timeouts on network equipment that shutdown TCP connections after a certain
amount of time and/or inactivity.
Physical connection breaks such as pulled cables, network equipment resets,
etc.
Oracle server being restarted, or DBA logically closing the connection on
the server side.
My questions are:
If ValidateConnection is set to TRUE, can oracle actually handle the above cases?
Do I need to write additional code or Oracle's connection pool will just wait until the connection is timedout?
What technique or tools can I use to test this cases? Sample code, or link to other article will be very useful.
Thanks.
ValidateConnection simply tells Oracle to test the connection from the pool prior to handing it to the application. This prevents you from getting already disconnected connections from the connection pool. To answer your question of which situations are handled by using ValidateConnection, I guess I would need to know what you mean by "handle". If the Oracle server has been disconnected from the internet, ValidateConnection cannot do anything about it. However, once it is back online, ValidateConnection will prevent Oracle from handing your application disconnected connections from the connection pool. The link below gives some a little more information, and he shortly describes how he tested ValidateConnection in his environment.
http://spdeveloper.net/2009/10/disconnected-odp-net-and-system-data-oracleclient-connections/

What is the difference between "ORA-12571: TNS packet writer failure" and "ORA-03135: connection lost contact"?

I am working in an environment where we get production issues from time to time related to Oracle connections. We use ODP.NET from ASP.NET applications, and we suspect the firewall closes connections that have been in the connection pool too long.
Sometimes we get an "ORA-12571: TNS packet writer failure" error, and sometimes we get "ORA-03135: connection lost contact."
I was wondering if someone has run into this and/or has an understanding of the difference between the 2 errors.
Using a mobile phone analogy:
ORA-12571 (Failure) Means call is dropped.
ORA-03135 (Connection Lost) Other party hung up.
My understanding is that 3135 occurs when a connection is lost. This doesn't tell you why the connection was lost, though. It may have been terminated by the server because the server failed to recieve a response to a probe for a certain amount of time, and assumed that the connection was dead. Or (I'm not sure about this) the exact reverse of that: the client failed to recieve a probe response from the server for a certain amount of time, so it assumed the connection was lost. The "certain amount of time" is cotrolled by SQLNET.EXPIRE_TIME=[minutes] in sqlnet.ora.
As for 12571, my (again vague) understanding is that there was a sudden failure to send a packet during communication with the server, and that this is typically caused by some software or hardware interfering with the connection (either by design, or by error). For instance, if you pull out your ethernet cable and then try to execute a query, you'll probably get this. Or if a firewall or anti-malware application decides to block the traffic.

Resources