Spring Mongo driver connection pool close error

I have a Spring WebFlux/reactive server that uses a singleton MongoDB instance running on the same machine. The server exposes a REST endpoint that triggers an external ETL (a Python script using a pymongo connection) against the same database. Running the ETL leads to a pool-closed error on my Spring server, and any subsequent database operation from the server fails.
2020-02-16T01:25:58.320+0530 [QUIET] [system.out] 2020-02-16 01:25:58.321 INFO 93553 --- [extShutdownHook] org.mongodb.driver.connection : Closed connection [connectionId{localValue:3, serverValue:133}] to localhost:27017 because the pool has been closed.
The ETL runs for about 10 seconds, but the Mongo driver never reconnects after the connection is closed from the pymongo side.
I tried various MongoDB configuration flags without success, and I don't know whether there is a setting for this. I am also willing to reconnect to MongoDB on every REST call to avoid the problem, so any ideas/suggestions there are welcome.
I was hoping MongoClient would provide an onClose() callback so the application could handle disconnections, but I could not find any such handler.
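For reference, the Java driver does let you observe pool events through a ConnectionPoolListener, which Spring Boot can register via a MongoClientSettingsBuilderCustomizer. This is only a minimal sketch, assuming Spring Boot 2.3+ and the 4.x driver; it does not fix the root cause, it just gives the application the "pool closed" hook the question asks about:

import com.mongodb.event.ConnectionPoolClosedEvent;
import com.mongodb.event.ConnectionPoolListener;
import org.springframework.boot.autoconfigure.mongo.MongoClientSettingsBuilderCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class MongoPoolListenerConfig {

    @Bean
    public MongoClientSettingsBuilderCustomizer poolListenerCustomizer() {
        // Register a listener so the application at least sees when the pool is closed.
        return builder -> builder.applyToConnectionPoolSettings(pool ->
                pool.addConnectionPoolListener(new ConnectionPoolListener() {
                    @Override
                    public void connectionPoolClosed(ConnectionPoolClosedEvent event) {
                        // Hook for alerting/metrics; recreating the MongoClient would go elsewhere.
                        System.err.println("Mongo connection pool closed: " + event.getServerId());
                    }
                }));
    }
}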

Related

HikariPool-1 - Connection is not available - WebClient

I have a client class for an external data provider.
In this class I'm using the reactive WebClient and a CrudRepository.
The WebClient uses the repository to save responses from the external API (a business requirement); for instance, the onError and onStatus handlers use this repository. We performed load tests and it works well.
The problem appears when the external API is not working and we retry a couple of times with exponential backoff. Then I get:
HikariPool-1 - Connection is not available, request timed out after 30005ms
org.springframework.dao.DataAccessResourceFailureException: Unable to acquire JDBC Connection; nested exception is org.hibernate.exception.JDBCConnectionException: Unable to acquire JDBC Connection
So it seems the WebClient is holding on to a connection while retrying for 30 seconds, and we're running out of connections. Extending the connection pool size is not how I want to fix that.
Is there any way to release the connection while the WebClient is just waiting for the next retry?
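One common way to avoid tying a JDBC connection to the retrying HTTP call is to keep the blocking repository save out of the retry loop and run it on a separate scheduler once a response is available. A minimal sketch, assuming Reactor's Retry.backoff; ResponseRepository and ProviderResponse are hypothetical stand-ins for the real repository and entity:

import java.time.Duration;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;
import reactor.util.retry.Retry;

public class ProviderClient {

    // Hypothetical blocking Spring Data style repository and entity, stand-ins for the real ones.
    public interface ResponseRepository { ProviderResponse save(ProviderResponse r); }
    public static class ProviderResponse {
        final String requestId; final String body;
        public ProviderResponse(String requestId, String body) { this.requestId = requestId; this.body = body; }
    }

    private final WebClient webClient;
    private final ResponseRepository repository;

    public ProviderClient(WebClient webClient, ResponseRepository repository) {
        this.webClient = webClient;
        this.repository = repository;
    }

    public Mono<String> fetch(String id) {
        return webClient.get()
                .uri("/provider/{id}", id)
                .retrieve()
                .bodyToMono(String.class)
                // Retry only the HTTP call; no repository (and no JDBC connection) is involved yet.
                .retryWhen(Retry.backoff(3, Duration.ofSeconds(1)))
                // Persist the outcome once, off the event loop, so a JDBC connection is borrowed
                // only for the duration of the save itself, not for the whole retry window.
                .flatMap(body -> Mono.fromCallable(() -> repository.save(new ProviderResponse(id, body)))
                        .subscribeOn(Schedulers.boundedElastic())
                        .thenReturn(body));
    }
}

Whether this applies depends on where the repository is actually invoked in your onError/onStatus handlers, but the idea is the same: wrap the blocking save in Mono.fromCallable(...).subscribeOn(Schedulers.boundedElastic()) and keep it outside the retried part of the chain.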

Playtika's OSS Feign Client: org.springframework.web.reactive.function.client.WebClientRequestException: Connection prematurely closed BEFORE response

The issue
I've stumbled upon the following issue:
Error message: org.springframework.web.reactive.function.client.WebClientRequestException: Connection prematurely closed BEFORE response; nested exception is reactor.netty.http.client.PrematureCloseException: Connection prematurely closed BEFORE response
General info about the issue
It's a Spring Boot app (2.4.5) running on Reactive (WebFlux) stack.
The app also uses the Playtika OSS reactive Feign client (starter 3.0.3) for synchronous REST API communication.
Underlying web client is Netty.
There are no special Feign or WebClient configs in the app.
All the other microservice parties are running on embedded Tomcat with default Spring Boot autoconfigurations.
All apps are running in Kubernetes cluster.
The error appears in the logs from time to time (not every day).
Guesses
After some investigation, my best guess is that some long-lived connections are being dropped from the pool under certain conditions, which is what causes the error log.
This thought is based on Instana, which connects the error log to a span that spans across a lot of subcalls.
Also, no data losses or other inconsistencies have been noticed so far xD
Questions
Does Feign have a connection pool by default?
How can I tell whether the connections being closed are live or idle connections from the pool?
How can the connection pool be configured (or disabled) to avoid long-running connections? (See the sketch after this list.)
Is it possible that Kubernetes somehow closes these connections?
What else can close connections?
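Reactor Netty (the underlying client here) does pool connections by default, and idle connections silently dropped by an intermediary (a Kubernetes service/ingress, or the server's keep-alive timeout) are a frequent cause of PrematureCloseException. Below is a minimal sketch of capping connection idle time and lifetime on the WebClient; the values are illustrative, and whether the Playtika Feign starter picks up a custom WebClient.Builder bean is an assumption to verify against its documentation:

import java.time.Duration;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.reactive.ReactorClientHttpConnector;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.netty.http.client.HttpClient;
import reactor.netty.resources.ConnectionProvider;

@Configuration
public class FeignWebClientConfig {

    @Bean
    public WebClient.Builder webClientBuilder() {
        ConnectionProvider provider = ConnectionProvider.builder("feign-pool")
                .maxConnections(200)
                // Evict connections before a peer or proxy gets a chance to close them silently.
                .maxIdleTime(Duration.ofSeconds(20))
                .maxLifeTime(Duration.ofMinutes(5))
                .evictInBackground(Duration.ofSeconds(30))
                .build();

        return WebClient.builder()
                .clientConnector(new ReactorClientHttpConnector(HttpClient.create(provider)));
    }
}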

Hikari CP (Spring Boot) Connection Recovery Problem After DB Failure

We have several microservices built on Spring Boot (2.2.4) and Hikari CP (3.4.2) with PostgreSQL.
Recently we faced a DB failure of around 30 seconds. After the connections were lost, some of the containers failed to recover their connections, while others with exactly the same configuration and application were fine. Unfortunately we don't have logs showing the pool sizes (idle/active/waiting) at the time of the error.
We received broken-pipe and connection-lost errors on all containers when the connections were lost. After the DB recovered, we got the following exception only on the few (2/18) containers that failed to recover.
StackTrace:
org.springframework.orm.jpa.JpaTransactionManager.doBegin(JpaTransactionManager.java:402)
    ... 20 more
Caused by: java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 30000ms.
    at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:689)
    at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:196)
    at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:161)
    at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128)
    at org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection(DatasourceConnectionProviderImpl.java:122)
    at org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:38)
    at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:104)
    ... 30 more
Caused by: org.postgresql.util.PSQLException: This connection has been closed.
    at org.postgresql.jdbc.PgConnection.checkClosed(PgConnection.java:857)
    at org.postgresql.jdbc.PgConnection.setNetworkTimeout(PgConnection.java:1639)
    at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:556)
    at com.zaxxer.hikari.pool.PoolBase.isConnectionAlive(PoolBase.java:169)
    at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:185)
    ... 35 more
We have seen similar situations on the same system, and tests where the DB fails over and the connections are restored by Hikari without any problem. In this case, however, one of the containers recovered by itself after 1 hour, and the others only after a restart.
As far as we know, Hikari does not hand out broken connections and evicts them from the pool once they are marked as broken or closed. Any ideas what might have happened to those containers while the others (exactly the same image and configuration) were fine?
PS: we cannot reproduce the problem.
Hikari configuration:
allowPoolSuspension.............false
connectionInitSql...............none
connectionTestQuery.............none
connectionTimeout...............30000
idleTimeout.....................600000
initializationFailTimeout.......1
isolateInternalQueries..........false
leakDetectionThreshold..........0
maxLifetime.....................1800000
maximumPoolSize.................15
minimumIdle.....................15
validationTimeout...............5000
You can configure something like:
connectionTestQuery=select 1
This way Hikari tests that the connection is still alive before handing it over to Hibernate.
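A minimal programmatic sketch of the same setting, in case you build the pool yourself rather than through Spring Boot properties (the JDBC URL and credentials are placeholders; in Spring Boot the equivalent property would be spring.datasource.hikari.connection-test-query):

import javax.sql.DataSource;
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class DataSourceConfig {

    @Bean
    public DataSource dataSource() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:postgresql://db-host:5432/appdb"); // placeholder
        config.setUsername("app");                                 // placeholder
        config.setPassword("secret");                              // placeholder
        config.setConnectionTestQuery("select 1"); // the validation query suggested above
        return new HikariDataSource(config);
    }
}

Note that with a JDBC4-compliant driver Hikari validates connections via Connection.isValid() by default, so the explicit test query is mainly useful when that default validation is not behaving as expected.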

Spring Boot Integration Testing: connection pool leaking

I have a Spring Boot application (1.5) that uses @Repository classes, @PersistenceContext and connection pooling (C3P0 with mssql-jdbc 6.1.0.jre8) to connect to an Azure SQL Database. However, we are hitting connection errors when running our test suite. When running netstat during the integration tests, I see the number of ESTABLISHED connections grow without bound. It reaches ~250 connections, then I start seeing connection pool exceptions and everything eventually dies.
My question is: what's the proper way to handle this situation? Is there a way to turn off connection pooling for integration testing, or do I need to manually shut down the connection pool at the end of each test?
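One plausible explanation for the unbounded growth is the Spring TestContext framework caching one application context per distinct test configuration, with each cached context holding its own pool open until the JVM exits. A minimal sketch of one mitigation, closing the context (and therefore its pool) after each test class; the test class name is hypothetical, and the trade-off is slower tests because contexts are rebuilt:

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.annotation.DirtiesContext;
import org.springframework.test.context.junit4.SpringRunner;

// Closing the cached application context after the class also closes its DataSource/pool,
// so connections do not pile up across many differently-configured test contexts.
@RunWith(SpringRunner.class)
@SpringBootTest
@DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_CLASS)
public class OrderRepositoryIT { // hypothetical integration test

    @Test
    public void contextLoads() {
        // placeholder; real tests exercising the repositories go here
    }
}

Reducing the number of distinct test configurations (so fewer contexts are cached) attacks the same problem from the other side.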

Spring boot/Amazon PostgreSQL RDS connection pool issue

I am troubleshooting an issue with a Spring Boot app connecting to a PostgreSQL database. The app runs normally, but under fairly moderate load it will begin to log errors like this:
java.sql.SQLException: Timeout after 30000ms of waiting for a connection.
This is running on an Amazon EC2 instance connecting to a PostgreSQL RDS. The app is configured like the following:
spring.datasource.url=jdbc:postgresql://[rds_path]:5432/[db name]
spring.datasource.username=[username]
spring.datasource.password=[password]
spring.datasource.max-active=100
In the AWS console, I see 60 connections active to the database, but that is across several Spring Boot apps (not all from this app). When I query the database for current activity using pg_stat_activity, I see all but one or two connections in an idle state. It would seem the Spring Boot app is not using all available connections, or is somehow leaking them. I'm trying to understand how pg_stat_activity can show so many idle connections while the app is still getting connection pool timeouts.
Figured it out. Spring Boot is using Hikari for database connection pooling (I didn't realize that until more closely inspecting the stack trace). Hikari's configuration parameters have different names; to set the pool size you use maximum-pool-size instead of max-active. Updated that and the problem was solved.
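For reference, assuming Spring Boot 2.x property binding for Hikari, the corrected configuration from the question would look like this (placeholders kept as in the original):
spring.datasource.url=jdbc:postgresql://[rds_path]:5432/[db name]
spring.datasource.username=[username]
spring.datasource.password=[password]
spring.datasource.hikari.maximum-pool-size=100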

Resources