ORDS was unable to make a connection to the database - oracle

I am seeing this issue frequently after implementing ORDS as part of the R12.2.9 upgrade. Our ORDS is hosted on a WebLogic server, and the issue occurs when there are 10 connections updating a single table. Is there any setting to control the maximum number of connections?
Complete Error:
ORDS was unable to make a connection to the database. This can occur if the database is unavailable, the maximum number of sessions has been reached or the pool is not correctly configured. The connection pool named: |apex|pu| had the following error(s): Exception occurred while getting connection: oracle.ucp.UniversalConnectionPoolException: All connections in the Universal Connection Pool are in use

That error means the pool has been exhausted. 10 is the DEFAULT pool size, and is almost NEVER correct for a production deployment.
It's very likely a decently active application will use all 10 connections from the pool, resulting in the exact error you are seeing.
So the answer: increase the max connections property for your pool and restart ORDS. The hard part is deciding, based on your application's performance and activity profile, how large the pool should be.
Some good advice can be found here from our Real World Performance Team.

You can use the jdbc.MaxLimit parameter when configuring ORDS. It defaults to 10 as the maximum number of connections.
jdbc.MaxLimit
Specifies the maximum number of connections.
Defaults to 10. (Might be too low for some production environments.)
Using a command like java -jar ords.war set-property jdbc.MaxLimit 50 would set the maximum number of connections to 50 (after reloading ORDS or restarting WebLogic).
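If you are using a file-based ORDS configuration (standalone or deployed on WebLogic), the same setting can also be inspected or edited in the pool's XML configuration; the exact file name and location vary by ORDS version, so treat this as a sketch:

  <!-- in defaults.xml (or the pool-specific file, e.g. apex_pu.xml) under the ORDS configuration directory -->
  <entry key="jdbc.MaxLimit">50</entry>

After changing the value, restart ORDS (or restart the WebLogic managed server) so the pool is rebuilt with the new size.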

Related

Azure VM is closing the idle sessions in my postgres database

I am creating a VM in Azure to run a Postgres instance in Docker and connect to it from my local Spring backend. What happens is that once connected to the DB, after X time of inactivity, any request fails with the following error: "HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection#f162126 (This connection has been closed.). Possibly consider using a shorter maxLifetime value." Digging around, I realized that the VM seems to behave as if it closes any connection that becomes inactive, causing the error above. The curious thing is that the sessions are not closed on the database side, as you can see in the attached image; even after shutting down my backend the sessions remain, and the only way to remove them is to restart the container hosting the DB.
I have tried to reproduce this behavior locally, but it never happens: even if I leave the backend idle for an hour, requests to the DB work as if nothing happened. It only happens with my VM in Azure.
I want to clarify that the sessions shown in the attached image no longer work, i.e. if I try to query the DB from Spring, the error I mentioned appears and Hikari automatically creates new sessions for its pool. I can repeat this behavior until I reach 100 sessions, which after a while stop working again and which Spring never closes when the backend shuts down.
HikariPool-1 - Failed to validate connection org.postgresql.jdbc.PgConnection#f162126 (This connection has been closed.). Possibly consider using a shorter maxLifetime value.
This error is thrown by the isConnectionDead method, which checks whether a connection is still alive and usable; it raises the error above if the connection has already been closed.
You can adjust your maxLifetime setting to resolve this problem. 30000 ms (30 seconds) is the shortest value permitted, and 1800000 ms (30 minutes) is the default.
A connection in the pool can only live for the amount of time controlled by the maxLifetime property, and its value should be several seconds shorter than any connection time limit imposed by your infrastructure or database.
Reference: HikariCP configuration on GitHub.
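As a minimal sketch of setting this programmatically (assuming a plain HikariCP setup rather than Spring Boot auto-configuration; the JDBC URL and credentials below are placeholders):

  import com.zaxxer.hikari.HikariConfig;
  import com.zaxxer.hikari.HikariDataSource;

  HikariConfig config = new HikariConfig();
  config.setJdbcUrl("jdbc:postgresql://<vm-host>:5432/mydb"); // placeholder URL
  config.setUsername("app");                                  // placeholder credentials
  config.setPassword("secret");
  // keep connections shorter-lived than any idle timeout enforced by the infrastructure
  config.setMaxLifetime(240_000);      // 4 minutes, in milliseconds
  config.setMaximumPoolSize(10);
  HikariDataSource ds = new HikariDataSource(config);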
Well, after much research and reviewing various sources, it turns out that when Azure creates a VM it applies certain security policies, as Pedro Perez explains in the following Stack Exchange post: Azure closing idle network connections
You're hitting a design feature of the software load balancer in front of your VMs. By default it will close any idle connections after 4 minutes, but you can configure the timeout to be anything between those 4 and 30 minutes
So, in order to override this policy on your VM, you have to create a load balancer, do all the relevant configuration, create a load-balancing rule for port 5432 (the default Postgres port), and set the idle timeout anywhere in the 4 to 30 minute range, according to your needs.
Finally, configure your VM so that its public IP points to the LB (Load Balancer) public IP, and everything will work normally.
It is true that if you simply want to live with Azure's default security policies on the VMs you create, you should set maxLifetime to a maximum of 4 minutes in your Spring application.properties or application.yml, as PratikLad says.
In my case I prefer to keep the default Hikari configuration (maxLifetime of 30 minutes), so I need to create the LB; but if you prefer to change the property and cap it at 4 minutes, you would not need any of the load-balancer setup described above.
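For the property-based route (assuming Spring Boot's default Hikari datasource configuration; adjust the value to your needs), the application.properties entry would look roughly like this:

  # keep pooled connections below Azure's 4-minute idle cutoff (value in milliseconds)
  spring.datasource.hikari.max-lifetime=240000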

TCPListener in golang: error when number of connections is above 60 / 260

I am building TCP Proxy: client <-> proxy <-> Vertica
I have a net.TCPListener that accepts incoming requests with AcceptTCP() and creates connections, then connects to the destination socket with net.DialTCP("tcp", nil, raddr). It works like a bridge, the usual proxy model.
Firstly, with the first version I had a problem: with 59 parallel incoming requests everything is fine, but as soon as there is one more (60), connections 1-59 are OK while connection 60 and beyond fail. I can't catch the error properly; it looks like some socket closes unexpectedly.
Secondly, I tried to set up a queue for the listener. That helped a lot, but with more than 258 requests I get the error again.
My question: is there any connection limit in the net package? Or maybe it is a system limitation?
For context: Vertica is running in a Docker container; hardware/system: MacBook; the Vertica connection pool limit is 5, but the pool logic is implemented in the proxy.
I also tried a "raw" proxy without the pool logic (that's why I set a queue for the listener: I must not exceed the threshold of the Vertica user's pool); the result is again 258 requests.
UPDATED: (05.04.2020)
Looks like system limitations are at fault. Did I mention anywhere that I was trying to run the whole system on one PC?
So, what I had:
300 parallel processes as requests (made by Python's multiprocessing.Pool), i.e. 300 sockets
a listener that creates 300 connections (another 300 sockets)
a series of rapidly created/closed sockets deep inside the proxy (according to the queue and the Vertica pool)
What I have now:
300 Python requests made from another PC on my local network (running Windows)
The proxy works fine
But I get several errors on the Windows PC that is creating the requests to my proxy, errors like low memory in the swap file.
I still need to run some stress tests against the proxy. Allocating less memory to the swap file didn't solve my problem on the Windows PC. I would be grateful for any suggestions and ideas. Thanks!
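One thing I still plan to check (this is only a guess, but the ~258-connection ceiling is suspiciously close to the usual macOS default of 256 open file descriptors per process) is the file-descriptor limit on the machine running the proxy:

  ulimit -n          # show the current per-process open-file limit (often 256 on macOS)
  ulimit -n 4096     # raise it for the current shell before starting the proxy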
How does the proxy connect to Vertica?
By default, a maximum of 50 ordinary mortal users can be connected to one Vertica node at any one time. The superuser "dbadmin" always has 5 connections in addition to that.
So if I try to connect 60 times as dbadmin, I get this on a default Vertica configuration:
Connection attempt failed: FATAL 4060: New session rejected due to limit, already 55 sessions active
You can increase the Vertica config item MaxClientSessions from its default of 50 per node.
The command is: ALTER NODE <node_name> SET MaxClientSessions = 100, for example.
I suppose you are always connecting to the same Vertica node, and that you have set ConnectionLoadBalancing to FALSE. So you always connect to the same node, and soon reach the default maximum of 50.
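To confirm whether the node really is at its session cap, a quick check (assuming you can query Vertica's v_monitor.sessions system table) might look like:

  -- count the current sessions per node
  SELECT node_name, COUNT(*) AS session_count
  FROM   v_monitor.sessions
  GROUP  BY node_name;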
Hope that's the reason found ....

Oracle Dataguard TAF (Transparent Application Failover) Issues

We have configured Oracle TAF (Transparent Application Failover) for a Data Guard database so that the application can use the same service name to connect to the database in case of any issue with the primary database and a switch to the standby. But we are having an odd problem: application servers within the same datacenter are able to connect to the DB, but servers in a different datacenter fail to connect using the TAF service. After the 90-second timeout interval it tries to connect to the standby host and fails as well.
Connections using the direct hostname and SID work perfectly fine, even across datacenters.
Error :
Caused by: java.io.IOException: Socket read timed out, socket connect lapse 3 ms. plx9852.xyz.com/135.167.30.103 1524 3 1 true
at oracle.net.nt.TcpNTAdapter.connect(TcpNTAdapter.java:209)
at oracle.net.nt.ConnOption.connect(ConnOption.java:161)
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:470)
... 54 more
pcdrest_taf.db.xyz.com=
(description=(connect_timeout=90)(retry_count=30)(retry_delay=3)(transport_connect_timeout=3)(load_balance=off)(failover=on)(address_list=(address=(protocol=tcp)(host=plx9843.xyz.com)(port=1524))(address=(protocol=tcp)(host=plx9852.xyz.com)(port=1524)))(connect_data=(service_name=pcdrest_taf.db.xyz.com)(failover_mode=(type=select)(method=basic))))
Connection string in the application, using LDAP:
spring.datasource.jdbcUrl=jdbc:oracle:thin:#ldap://polarx.xyz.com:3060/pcdrest_taf,cn=OracleContext,dc=db,dc=xyz,dc=com ldap://polarx1.xyz.com:3060/pcdrest_taf,cn=OracleContext,dc=db,dc=xyz,dc=com ldap://polarx2.sbc.com:3060/pcdrest_taf,cn=OracleContext,dc=db,dc=xyz,dc=com ldap://polarx3.sbc.com:3060/pcdrest_taf,cn=OracleContext,dc=db,dc=xyz,dc=com ldap://polarx4.sbc.com:3060/pcdrest_taf,cn=OracleContext,dc=db,dc=xyz,dc=com ldap://polarx5.sbc.com:3060/pcdrest_taf,cn=OracleContext,dc=db,dc=xyz,dc=com
Just beware that Oracle changed the meaning of transport_connect_timeout from seconds to milliseconds, without any warning, in release 12.1.
So if you use that version there is no way to tell whether 3 means seconds or milliseconds.
Since version 12.2, your value of 3 (milliseconds) is simply too low.
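Assuming the 12.2+ JDBC thin interpretation (milliseconds), a sketch of a safer descriptor would spell the transport timeout out explicitly, leaving the rest of your entry unchanged:

  pcdrest_taf.db.xyz.com=
  (description=(connect_timeout=90)(retry_count=30)(retry_delay=3)(transport_connect_timeout=3000)(load_balance=off)(failover=on)(address_list=...)(connect_data=...))

Here 3000 means 3 seconds on a driver that reads milliseconds; on an older driver that still reads seconds the same value would mean 50 minutes, so check which driver version you actually ship.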
Moreover, there were several bugs in the Oracle JDBC driver related to TAF. For example:
Bug 12998506: RETRY_COUNT connection parameter is total number of connection attempts when using JDBC thin
Description:
The RETRY_COUNT connection parameter is the number of additional times a connection attempt should be made after the initial attempt has failed. Therefore, if RETRY_COUNT is 2, a maximum of 3 connection attempts will be made for each address in the ADDRESS_LIST. However, JDBC thin takes RETRY_COUNT to mean the total number of connection attempts, so in the above example JDBC thin will make a maximum of 2 attempts for each address instead of the expected 3.
This is a follow-on from bug 12760352, where addresses in the ADDRESS_LIST were being retried in the wrong order when using JDBC thin (e.g. if the address list contained A and B, JDBC thin would attempt connections as A A ... B B ... instead of A B A B ...).
PS: the retry_delay parameter seems to be ignored by JDBC drivers from version 12c onward.

Connection pool opens more connections than maximum pool size

Hey I'm using Glassfish open source v4 and I'm having a weird problem.
I have defined a JDBC connection pool to Oracle 11g in the admin console and I've set :
Pool Settings
Initial and Minimum Pool Size: 500
Maximum Pool Size: 1000
Pool Resize Quantity: 750
And I've created a specific user for this connection pool. Yet sometimes when I inspect open connections in the database, I see that there are more than 1000 (the maximum I've seen was 1440).
When this happens, query attempts fail: some with an OutOfMemory exception, some show HTTP thread interruptions, and some don't produce any logs at all and just take a very long time.
What I am wondering is: how is it possible that Glassfish opens more connections than I've configured it to?
First, try to compare the output of netstat on the application server and on the DB server side. You may have some "dangling" connections. Also try to find some documentation about DCD (Dead Connection Detection) in Oracle.
A few years ago I saw situations where a Java application server decided a connection was dead because it had not responded for a few minutes. Such a connection was put onto a dead-connection list and a new connection was created.
There can also be network issues, for example a firewall between the application and the DB server.
When a TCP connection has been inactive for an hour it may be cut on one side, but the DB server does not know about that.
The usual way to investigate this is:
compare the output of both netstat(s) (appl./db)
identify dangling TCP connections
translate each TCP connection to the Unix process id (PID) of the Oracle session process
translate the PID to an Oracle session (SID and SERIAL#), as in the sketch below
kill the session at the Oracle level (alter system kill session ...)
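A sketch of the last two steps on the Oracle side (assuming a dedicated-server configuration, where each session has its own server process; the PID below is a made-up example taken from netstat):

  -- map the OS process id to an Oracle session
  SELECT s.sid, s.serial#, s.username, s.status
  FROM   v$session s
  JOIN   v$process p ON p.addr = s.paddr
  WHERE  p.spid = '12345';

  -- then kill the dangling session
  ALTER SYSTEM KILL SESSION '<sid>,<serial#>';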

Oracle OLEDB Connection Pooling and Invalid Connections

We are using ADO to access Oracle 10g Release 2 through the OLE DB provider for Oracle 10g, and we are facing an issue with connection pooling. The database resides on a remote machine and connection pooling occurs as it should. But if the remote machine goes down for some reason, the connection is still returned from the pool and any query on that connection fails. When this connection is closed, it goes back into the pool instead of being invalidated. Subsequent connection-open requests succeed, but queries fail. This is strange behaviour: according to the OLE DB specification, the provider must support the DBPROP_CONNECTIONSTATUS property, so an invalid connection should not be returned to the pool.
Things get weirder when the remote machine comes back up. The connections in the pool are still invalid, and although opening a connection succeeds, queries on it fail. Oracle OLE DB is unable to connect to the server anymore and we have to restart our application. This is undesirable, because our application is a critical one.
Any ideas on how to get around this?
Thanks
Mubashir
If you are doing this programmatically, use a try block so that if something does go wrong the application won't fail. With a try block you can catch the exception and ignore it, so that the errors are suppressed.
You could tell the pool to not accept invalid connections, by marking the connection invalid before it is returned to the pool.
Connections are recovered after 10 minutes by default. Time can be set by the registry key SPTimeout under the oledb provider's root key.
Actually, the default connection pool timeout is 120 seconds, at least for this Oracle 11 32-bit OLE DB installation. You can find the registry settings at:
Computer\HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Oracle\KEY_orac
Where KEY_orac is the KEY_<oracle_home_name>
The key name is ORAMTS_CONN_POOL_TIMEOUT and the default value is 120.
It does not appear that you can set connection pool parameters at the connection string level.
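For reference, a hedged example of reading and changing that value from an elevated command prompt (the exact key path depends on your Oracle home name, and the value type should match whatever is already stored there, so verify with the query before changing anything):

  reg query "HKLM\SOFTWARE\WOW6432Node\Oracle\KEY_orac" /v ORAMTS_CONN_POOL_TIMEOUT
  reg add "HKLM\SOFTWARE\WOW6432Node\Oracle\KEY_orac" /v ORAMTS_CONN_POOL_TIMEOUT /t REG_SZ /d 60 /f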
In most connection pool implementations it is possible to check a connection before using it. For example, you define a validation query like select * from dual; whenever a connection is picked up from the pool, this query is executed first. If it fails, the connection is removed from the pool and a new one is opened.
