Connection pooling with HikariCP, Spring Boot and Kubernetes

I am using HikariCP for connection pooling in my reactive Spring Boot application running in a Kubernetes cluster. There will be lots of blocking calls and multiple database queries, so ideally a larger number of database connections would help, provided enough CPU cores are available.
Giving all the CPU cores to one Kubernetes container would waste resources, as request spikes will not always be there. So I am trying to explore how to use the Kubernetes autoscaler so that new application containers are spun up as the number of requests increases. Two concerns:
I tried the Hikari setting com.zaxxer.hikari.blockUntilFilled=true to fill the pool with connections during application startup. But when the autoscaler adds containers under rising load, this delays responses, because creating all the connections in the pool takes time. Is it better to let Hikari create connections dynamically as demand spikes rather than creating them all at once during startup?
Also, each Kubernetes container will be a new instance of the application, so how do we manage the total number of database connections created?
I did a sample load test with JMeter and saw improved performance (and no timeouts etc.) with a large number of requests when using a fixed number of active database connections. There were a large number of thread-interrupted exceptions when no fixed connection pool size was set and connections were created dynamically as the number of requests increased.
Any insights will help.
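On the second concern, one way to reason about it is to treat the database's connection limit as a budget shared by all replicas, so each pod's maximum pool size is that budget divided by the autoscaler's maximum replica count. The sketch below shows only that arithmetic. The figures (100 server connections, 10 reserved, 6 replicas) and the names PoolBudget and perPodPoolSize are illustrative assumptions, not values from the setup described above.

```java
/** Rough per-pod pool budget; every number passed in is an assumption for illustration. */
final class PoolBudget {

    private PoolBudget() {}

    /**
     * @param dbMaxConnections connection limit configured on the database server
     * @param reserved         connections kept free for admin tools or other apps
     * @param maxReplicas      upper bound on pods the autoscaler may create
     * @return largest pool size each pod can use without exceeding the server limit
     */
    static int perPodPoolSize(int dbMaxConnections, int reserved, int maxReplicas) {
        if (maxReplicas <= 0) {
            throw new IllegalArgumentException("maxReplicas must be positive");
        }
        int usable = dbMaxConnections - reserved;
        if (usable < maxReplicas) {
            throw new IllegalArgumentException("not enough connections for one per replica");
        }
        // Integer division rounds down, so the total never exceeds the usable budget.
        return usable / maxReplicas;
    }

    public static void main(String[] args) {
        // Hypothetical figures: 100 server connections, 10 reserved, up to 6 pods.
        System.out.println(perPodPoolSize(100, 10, 6)); // prints 15
    }
}
```

The result would be the value for spring.datasource.hikari.maximum-pool-size in each pod. Setting minimum-idle to the same number gives a fixed-size pool, which matches the configuration that behaved well in the load test.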

Related

SpringBoot and HikariCP relationship

Spring Boot already manages the data connection, so why is HikariCP needed?
I have just started using Spring Boot, so I do not know much about how Spring Boot and Hikari relate. I have read about Hikari but couldn't find any explicit explanation of its relationship with Spring Boot when Spring already provides the data connection.
I read that Hikari is used when we need heavy DB operations with lots of connections. If that is true, should we not use Hikari in the following scenario?
Scenario:
There is a small application that receives at most 8-10 REST calls once a month, or at most fortnightly. The application needs to perform some probability and statistics calculations.
At most 2-3 users are logged in to the app at a time.
Do we still need to use Hikari?
There are two ways to communicate with the database from your application. You can either open a new DB connection every time you wish to execute a query, or you can use a connection pool. A connection pool is a collection of reusable connections that the application uses for DB communication. As establishing a new connection is a relatively expensive operation, using a connection pool gives you a significant performance improvement.
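To make the reuse idea concrete, here is a toy fixed-size pool built only on the JDK. It is a sketch of the concept, not how HikariCP is implemented; the names TinyPool, borrow and giveBack are made up for illustration, and the "connections" are plain strings.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Toy fixed-size pool: resources are created once and handed out repeatedly. */
final class TinyPool<T> {

    private final BlockingQueue<T> idle;

    TinyPool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // the expensive "connect" happens only here
        }
    }

    /** Waits up to timeoutMs for a free resource, like a pool's connection timeout. */
    T borrow(long timeoutMs) throws InterruptedException, TimeoutException {
        T resource = idle.poll(timeoutMs, TimeUnit.MILLISECONDS);
        if (resource == null) {
            throw new TimeoutException("no resource free after " + timeoutMs + " ms");
        }
        return resource;
    }

    /** Returns a borrowed resource so that the next caller can reuse it. */
    void giveBack(T resource) {
        idle.add(resource);
    }

    public static void main(String[] args) throws Exception {
        int[] created = {0};
        TinyPool<String> pool = new TinyPool<>(2, () -> "conn-" + (++created[0]));
        for (int i = 0; i < 5; i++) {
            String conn = pool.borrow(100);
            pool.giveBack(conn);
        }
        System.out.println("created: " + created[0]); // prints "created: 2"
    }
}
```

Five borrow/return cycles are served by the two resources created up front, which is the saving a real pool gives you over opening a connection per query.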
HikariCP is one of the connection pool libraries available in Java, and Spring Boot uses it as the default. As you don't need to do anything special to have it in your application, just enjoy your free lunch :)
HikariCP is the default connection pool in Spring Boot 2; it was Tomcat JDBC in Spring Boot 1. You are most likely already using it with its default settings. You can override it by configuring another connection pool in your properties if you need to. Please find more details about the connection pools and the default configurations of Spring Boot versions here.
Hikari is the default DataSource implementation in Spring Boot 2. This means we need not add an explicit dependency in the pom.xml. The spring-boot-starter-jdbc and spring-boot-starter-data-jpa starters pull it in by default. To sum up, you require no other steps with Spring Boot 2.
Compared to other implementations, it promises to be lightweight and better performing.
Tuning Hikari Configuration Parameters:
# maximum number of milliseconds that a client will wait for a connection
spring.datasource.hikari.connection-timeout=20000
# minimum number of idle connections maintained by HikariCP in the pool
spring.datasource.hikari.minimum-idle=10
# maximum pool size
spring.datasource.hikari.maximum-pool-size=10
# maximum time in milliseconds that a connection may sit idle in the pool
spring.datasource.hikari.idle-timeout=10000
# maximum lifetime in milliseconds of a connection in the pool (HikariCP's documented minimum is 30000)
spring.datasource.hikari.max-lifetime=30000
# default auto-commit behavior
spring.datasource.hikari.auto-commit=true
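A note on comments in such a file: in the Java .properties format a # starts a comment only at the beginning of a line, so a note written after a value on the same line is read as part of that value. The sketch below demonstrates this with the JDK's java.util.Properties. Spring Boot loads application.properties with its own parser, and the assumption here is that it treats trailing # text the same way; the class name InlineCommentCheck is made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

final class InlineCommentCheck {

    private InlineCommentCheck() {}

    /** Parses .properties text with the JDK loader and returns the raw value for key. */
    static String valueOf(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory reader
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String key = "spring.datasource.hikari.maximum-pool-size";

        // Trailing text after the value is kept, so the value is no longer a number.
        System.out.println(valueOf(key + "= 10 #maximum pool size\n", key));
        // prints "10 #maximum pool size"

        // A comment on its own line is skipped and the value parses cleanly.
        System.out.println(valueOf("# maximum pool size\n" + key + "=10\n", key));
        // prints "10"
    }
}
```

Keeping each comment on its own line above the property avoids the problem.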
HikariCP is a reliable, high-performance JDBC connection pool. It is lightweight and performs better than other connection pool libraries. For these reasons, HikariCP is now the default pool implementation in Spring Boot 2. In this article, we will take a closer look at configuring Hikari with Spring Boot.

Spring boot/Amazon PostgreSQL RDS connection pool issue

I am troubleshooting an issue with a Spring Boot app connecting to a PostgreSQL database. The app runs normally, but under fairly moderate load it will begin to log errors like this:
java.sql.SQLException: Timeout after 30000ms of waiting for a connection.
This is running on an Amazon EC2 instance connecting to a PostgreSQL RDS. The app is configured like the following:
spring.datasource.url=jdbc:postgresql://[rds_path]:5432/[db name]
spring.datasource.username=[username]
spring.datasource.password=[password]
spring.datasource.max-active=100
In the AWS console, I see 60 active connections to the database, but that is across several Spring Boot apps (not only this app). When I query the database for current activity using pg_stat_activity, I see all but one or two connections in an idle state. It would seem the Spring Boot app is not using all available connections? Or is it somehow leaking connections? I'm trying to understand how pg_stat_activity can show so many idle connections while the app still gets connection pool timeouts.
Figured it out. Spring is using the Hikari connection pool (I didn't realize that until I inspected the stack trace more closely). Hikari's configuration parameters have different names: to set the pool size you use maximum-pool-size (spring.datasource.hikari.maximum-pool-size) rather than max-active. Updated that and problem solved.

Should Hystrix replace existing JDBC/HTTP connection pools, or delegate to them?

Many applications use connection pools for both HTTP and JDBC calls for resiliency. But using and configuring these 2 types of pools is very different. This duplicates the complexity of implementing resiliency patterns that are common to both - such as timeouts, retries, caching / alerting fallbacks, circuit breaking, and monitoring.
To my mind Hystrix offers common approaches of configuring and implementing these same resiliency patterns for both HTTP and JDBC calls.
My questions are:
Could Hystrix theoretically replace existing HTTP and JDBC connection pools entirely?
If so, what are the pros and cons of doing so?
Replacing them entirely reduces the world of complexity that surrounds these connection pools - with their attendant timeout and validation query properties etc. However I am hazy about how Hystrix could "keep alive" JDBC / HTTP connections - and therefore avoid expensive connection setup costs - without delegating to existing libraries specialized for these tasks.
For context I have a DropWizard app, which uses Tomcat DBCP for its JDBC connection pool and Apache HttpClient for its HTTP connection pools.
No, Hystrix cannot replace your connection pools.
Hystrix's main features are:
Limiting the number of calls to a service by using a limited thread pool or semaphores.
The possibility to time out calls to a service to avoid application threads being locked up waiting for slow/hung services.
Adding bulkheads so that one slow service minimally affects the rest of the application.
Circuit breaking slow/hung services.
There is no support for pooling connections.
I guess you can argue that the first point is somewhat related to a connection pool, in that both Hystrix and a connection pool can limit the load against another system. However, the main reason to have a connection pool is the performance gain from reusing connections. The load-limiting behavior is basically a bonus of connection pooling.
Hystrix could, however, complement connection pools by providing fail-fast timeout behavior and bulkheads if added in front of your connection pools, as you suggest in your question.
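To illustrate what "in front of your connection pools" could look like, here is a plain java.util.concurrent sketch of a bulkhead with a fail-fast timeout. It does not use the Hystrix API and leaves out circuit breaking; the class name GuardedCall and its method are hypothetical. The work passed to call would be whatever borrows a pooled connection and runs the query or HTTP request.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Not Hystrix: a plain-JDK sketch of a bulkhead with a fail-fast timeout. */
final class GuardedCall implements AutoCloseable {

    private final Semaphore permits;        // bulkhead: caps concurrent calls
    private final ExecutorService workers;  // runs guarded work off the caller's thread
    private final long timeoutMs;

    GuardedCall(int maxConcurrent, long timeoutMs) {
        this.permits = new Semaphore(maxConcurrent);
        this.workers = Executors.newFixedThreadPool(maxConcurrent);
        this.timeoutMs = timeoutMs;
    }

    /** Runs work, or returns fallback if the bulkhead is full, the call is slow, or it fails. */
    <T> T call(Callable<T> work, T fallback) {
        if (!permits.tryAcquire()) {
            return fallback; // bulkhead full: reject at once instead of queueing
        }
        Future<T> future = workers.submit(work);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the slow call so the worker can be freed
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the work itself threw
        } finally {
            permits.release();
        }
    }

    @Override
    public void close() {
        workers.shutdownNow();
    }
}
```

Note that the timeout only frees the worker if the underlying call responds to interruption. The pooled resource still has to be returned by the work itself, which is why this kind of wrapper sits in front of a pool rather than replacing it.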

Can I use separate non connection pool data source for long running but infrequent tasks?

My application stack consists of Spring MVC, Hibernate and MySQL hosted on Apache Tomcat 7.
I have set up Spring to manage transactions, and the Hibernate session factory uses a datasource backed by the Tomcat DBCP connection pool to get its connections.
I have a use case in which I have to run a long-running task that is initiated through the web UI (say, a button click). When this task runs for, let's say, 10 minutes, my connection pool starts to throw connection closed exceptions. This is obviously because of the connection pool setting under which a connection that is not returned to the pool within a specific time is marked as abandoned and later removed. I could solve this by increasing the timeout to a large enough value. But I may have several other use cases like this and currently have no idea how long those will run.
So I am thinking of another approach here.
This use case will not be initiated very often, so I may use a separate datasource definition without a connection pool. Of course I can set up two transaction managers in Spring with different names, "abc" and "xyz", and use @Transactional("abc") and @Transactional("xyz"). Both these transaction managers would use their respective datasources: one with a connection pool to support the common use cases and one without a connection pool to support the long-running transaction. This way I won't have to worry about changing the timeout configuration.
Will this be a generally accepted solution or should I take the timeout configuration approach?
Not using a connection pool will cause problems if you don't have another way to limit the number of connections that your application can open. For example (a trivial example, of course), if you're going to launch your batch process each time a user clicks a button, make sure you limit how many times they can do this.
Another way would be to define a new JDBC resource in your application server (jdbc/batchprocess) and configure a longer timeout for it. Then switch between the two using dynamic datasource routing.
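Spring ships a class for this kind of routing, AbstractRoutingDataSource, where you override determineCurrentLookupKey(). The sketch below is not that class; it is a plain-JDK illustration of the underlying idea, a per-thread key that selects one of several targets. RoutingSketch, withKey and resolve are hypothetical names, and plain strings such as "jdbc/default" stand in for real DataSource objects.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-JDK sketch of key-based routing; target values stand in for real DataSources. */
final class RoutingSketch<T> {

    // Per-thread key naming the target that the current work should use.
    private static final ThreadLocal<String> CURRENT_KEY = new ThreadLocal<>();

    private final Map<String, T> targets;
    private final T defaultTarget;

    RoutingSketch(Map<String, T> targets, T defaultTarget) {
        this.targets = new HashMap<>(targets);
        this.defaultTarget = defaultTarget;
    }

    /** Runs task with the given key active, then restores the previous key. */
    static void withKey(String key, Runnable task) {
        String previous = CURRENT_KEY.get();
        CURRENT_KEY.set(key);
        try {
            task.run();
        } finally {
            if (previous == null) {
                CURRENT_KEY.remove();
            } else {
                CURRENT_KEY.set(previous);
            }
        }
    }

    /** Picks the target for the current thread, falling back to the default. */
    T resolve() {
        String key = CURRENT_KEY.get();
        return key == null ? defaultTarget : targets.getOrDefault(key, defaultTarget);
    }
}
```

With a router holding "batch" mapped to jdbc/batchprocess, the long-running task would run inside RoutingSketch.withKey("batch", ...) and get the resource with the longer timeout, while all other requests keep resolving to the default pooled one.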
You can open Hibernate Sessions, supplying your own Connection:
sessionFactory.withOptions().connection( yourConnection ).openSession();

Spring JMS: Creating multiple connection to a queue

To process a large number of messages arriving on a queue, I need a guarantee that at least one JMS connection is available at any time. I am using Spring, and Spring only allows multiple sessions on a single connection. If that one and only connection fails, the application will come to a standstill until Spring reconnects to the JMS bridge.
So how can I create more than one connection to a queue in Spring, and how can I do connection pooling here?
The answer to this depends on whether you are using Spring inside a J2EE container (JBoss etc.) or in a standalone application.
Standalone: you'll find pooling connections to be a problem. Spring's SingleConnectionFactory can be set up to renew the connection on an exception, guaranteeing that at some point a connection will come online and start processing the queue again. But you'll still have the problem of waiting for that single connection to renew, and depending on what messaging implementation you're dealing with and how it does load balancing, you may find yourself stuck with a connection to a single node in a cluster.
If you are running in a container, you can rely on the container's connection factory, which will be much more robust. JBoss Messaging in the container, for instance, will fail over seamlessly to other nodes and handles pooling under the covers. But if you're working in the container, it's usually easier to bail on JmsTemplate, which kind of sucks, and use whatever that container provides.
