HikariCP killing connections - Spring

Can someone help me understand the Hikari implementation better? We currently have the following settings in use:
spring.datasource.hikari.maxLifetime=600000
spring.datasource.hikari.maximumPoolSize=3
spring.datasource.hikari.minimumIdle=1
What we see is that after 10 minutes the application's connections are cleared, and it then struggles to make new connections to the DB, resulting in "sessions exceeded" errors from the DB. We understand what those errors are, but it seems that rather than reuse the existing sessions, Hikari tries to create new ones, and the DB does not allow that, as it is supposed to.
Why is Hikari trying to establish new connections instead of using the sessions already available on the DB?
How does maxLifetime work? Reading the documentation on the HikariCP GitHub page, it seems straightforward enough.
We have also set
spring.datasource.hikari.idle-timeout=10000
in the hope that it would not kill those connections quite so quickly, but it still seems to remove them...
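For reference, here are the same settings again with the units and the documented behaviour spelled out as comments (all Hikari timing properties are in milliseconds):

spring.datasource.hikari.maximumPoolSize=3
spring.datasource.hikari.minimumIdle=1
# maxLifetime: a connection is retired roughly this long (600000 ms = 10 min) after it
# was created, but only once it is no longer in use; HikariCP then opens a replacement
# as needed, so retirement by itself should not leave the pool empty.
spring.datasource.hikari.maxLifetime=600000
# idle-timeout: only idle connections above minimumIdle are closed by this setting, and
# 10000 ms (10 seconds) is the smallest value HikariCP accepts for it, so lowering it
# makes surplus idle connections go away sooner, not later.
spring.datasource.hikari.idle-timeout=10000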
Thanks

Related

Getting subsequent connections using HikariCP during same request seems slow

I have a Spring boot app that use HikariCP for Postgres connection pooling.
Recently I've set up tracing to collect some data on how time is spent when handling a request to a specific endpoint.
My assumptions are that when using HikariCP:
The first connection to the database while handling the request might be a bit slower
Subsequent connections to the database should be fast (< 10 ms)
However, as the trace shows, the first connection is fast (< 10 ms). And while some subsequent connections during the same request handling are also fast (< 10 ms), I frequently see some subsequent connections taking 50-100ms, which seems quite slow to me, although I'm not sure if this is to be expected or not.
Is there anything I can configure to improve this behavior?
Maybe good to know:
The backend in question doesn't really see any other traffic right now, so it's only handling traffic when I manually send requests to it
I've changed maximumPoolSize to 1 to rule out that the issue is caused by different connections being used within the same request. The same behavior is still seen.
I use the default Hikari settings, I don't change them.
I do think something is wrong with your pool configuration or your usage of the pool if it takes roughly 10 ms to get an already initialized connection from your pool. I would expect it to be sub-millisecond... Are you sure you are using the pool correctly?
Make sure you are using the newest versions of the pool and the driver that you can, and make sure that connectionTestQuery is not set, as that would execute a query every time a connection is obtained from the pool. The defaults should be good enough for the rest of the settings.
Debug logs could be one way to help figure out what is happening; metrics on the pool are another. Have a look at Spring Boot Actuator, it will help you with that...
To answer your actual question on how you can improve the situation, given that it actually takes roughly 10 ms to obtain a connection: do not obtain and return the connection to the pool for every query... If you do not want to pass the connection around in your code, and if it suits your use case, you can make this happen easily by making sure your whole request is wrapped in a transaction. See the Spring guide on managing transactions.
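A minimal sketch of that transaction-wrapping approach, assuming Spring's @Transactional with JdbcTemplate; the service name and queries are made up for illustration:

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class ReportService {

    private final JdbcTemplate jdbc;

    public ReportService(JdbcTemplate jdbc) {
        this.jdbc = jdbc;
    }

    // One connection is taken from Hikari when the transaction starts and is
    // reused for every query below, instead of a checkout per query.
    @Transactional
    public int countUserActivity(long userId) {
        Integer orders = jdbc.queryForObject(
                "select count(*) from orders where user_id = ?", Integer.class, userId);
        Integer items = jdbc.queryForObject(
                "select count(*) from order_items where user_id = ?", Integer.class, userId);
        return (orders == null ? 0 : orders) + (items == null ? 0 : items);
    }
}

With the transaction in place, the pool is asked for a connection once per request instead of once per query, so the repeated checkouts no longer show up as separate acquisitions in the trace.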

How do you use go-sql-driver when you have a sharded MySQL database solution?

Reading this article: http://go-database-sql.org/accessing.html
It says that the sql.DB object is designed to be long-lived and that we should not Open() and Close() databases frequently. But what should I do if I have 10 different MySQL servers and I have sharded them so that there are 511 databases on each server, for example the way Pinterest shards their data with MySQL?
https://medium.com/@Pinterest_Engineering/sharding-pinterest-how-we-scaled-our-mysql-fleet-3f341e96ca6f
Would I then not need to constantly access different nodes and databases all the time? As I understand it, I would then have to Open and Close the database connection all the time, depending on which node and database I have to access.
It also says that:
If you don’t treat the sql.DB as a long-lived object, you could experience problems such as poor reuse and sharing of connections, running out of available network resources, or sporadic failures due to a lot of TCP connections remaining in TIME_WAIT status. Such problems are signs that you’re not using database/sql as it was designed.
Will this be a problem? How should I solve this issue then?
I am also interested in this question. I guess the solution could be something like this:
Minimize the number of idle connections in the pool: db.SetMaxIdleConns(N).
Keep a map[serverID]*sql.DB; when there is no *sql.DB for a server yet, open it and add it to the map (see the sketch after this list).
Make data more local, so backends usually go to "their" databases. However, Pinterest seems not to do this.
Increase the number of sockets and open files allowed on the backend machines so they can keep more open connections.
Set a reasonable idle timeout so very old unused connections get closed.
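A minimal sketch of the map[serverID]*sql.DB idea, assuming go-sql-driver/mysql; the Registry type, the DSN layout and the pool limits are made up for illustration:

package shards

import (
    "database/sql"
    "fmt"
    "sync"
    "time"

    _ "github.com/go-sql-driver/mysql" // registers the "mysql" driver
)

// Registry keeps one long-lived *sql.DB per shard server, opened lazily on
// first use and reused afterwards instead of Open/Close per request.
type Registry struct {
    mu   sync.Mutex
    dbs  map[string]*sql.DB
    dsns map[string]string // serverID -> DSN, e.g. "user:pass@tcp(host:3306)/"
}

func NewRegistry(dsns map[string]string) *Registry {
    return &Registry{dbs: make(map[string]*sql.DB), dsns: dsns}
}

// Get returns the pool for a server, opening it the first time it is asked for.
func (r *Registry) Get(serverID string) (*sql.DB, error) {
    r.mu.Lock()
    defer r.mu.Unlock()
    if db, ok := r.dbs[serverID]; ok {
        return db, nil
    }
    dsn, ok := r.dsns[serverID]
    if !ok {
        return nil, fmt.Errorf("unknown server %q", serverID)
    }
    db, err := sql.Open("mysql", dsn)
    if err != nil {
        return nil, err
    }
    db.SetMaxIdleConns(2)                  // keep only a couple of idle connections per server
    db.SetConnMaxLifetime(5 * time.Minute) // recycle old connections instead of keeping them forever
    r.dbs[serverID] = db
    return db, nil
}

Because each *sql.DB here is per server rather than per database, queries can use fully qualified table names (for example a hypothetical db00123.pins), so all 511 databases on a server share one long-lived pool.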

Should I explicitly close RethinkDB connections?

I'm a little hazy on how connections in RethinkDB work. I'm opening a new connection every time I execute queries without closing them once the queries finish.
Is this a good practice? Or should I be explicitly closing connections once queries are finished?
(I'm using the JS driver. I don't believe the documentation speaks to this)
[edited cuz the previous post title was vague]
You should explicitly close connections, otherwise you will exhaust the database server. I'm assuming you are running Node.js, which will keep connections open until you kill the application.
Preferably you would use a pool, to lessen the overhead of connecting. For a pre-made solution, look into rethinkdbdash, which has basically the same API as the official driver, but with built-in pooling.
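For the close-it-explicitly part, a minimal sketch with the official rethinkdb JS driver; the connection options and table name are placeholders:

const r = require('rethinkdb');

// Open a connection, run the query, and always close the connection when done;
// without the close() every request leaves another connection open on the server.
async function listUsers() {
  const conn = await r.connect({ host: 'localhost', port: 28015 });
  try {
    const cursor = await r.table('users').run(conn);
    return await cursor.toArray();
  } finally {
    await conn.close();
  }
}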

JDBC connection pool manager

We're in the process of rewriting a web application in Java, coming from PHP. I think, but I'm not really sure, that we might run into problems with regard to connection pooling. The application itself is multi-tenant, and is a combination of "Separate database" and "Separate schema".
For every Postgres database server instance, there can be more than one database (named schema_XXX), each holding more than one schema (where a schema is a tenant). On signup, one of two things can happen:
A new tenant schema is created in the highest numbered schema_XXX database.
The signup process sees that a database has been fully allocated and creates a new schema_XXX+1 database. In this new database, the tenant schema is created.
All tenants are known via a central registry (also a Postgres database). When a session is established the registry will resolve the host, database and schema of the tenant and a database session is established for that HTTP request.
Now, the problem I think I'm seeing here is twofold:
A JDBC connection pool is defined when the application starts. By that I mean that all databases (host + database) are known at startup. This conflicts with the signup process.
As I'm writing this we have ~20 database servers with ~1000 databases, for a total of ~100k (tenant) schemas. Given those numbers, I would need 20*1000 data sources for every instance of the application. I'm assuming that all pools are also, at one time or another, started. I'm not sure how many resources a pool allocates, but it must be a non-trivial amount for 20k pools.
So, is it feasible to even assume that a connection pool can be used for this?
For the first problem, I guess that a pool with support for JMX can be used, and that we create a new data source when and if a new schema_XXX database is created. The larger issue is the huge number of pools. For that, I guess, some sort of pool manager should be used that can terminate a pool that has no open connections (and on demand also start a pool). I have not found anything that supports this.
What options do I have? Or should I just bite the bullet and fall back to an out-of-process connection pool such as PgBouncer, establishing a plain JDBC connection per request, similar to how we're handling it now with PHP?
A few things:
A Connection pool need not be instantiated only at application start-up. You can create or destroy them whenever you want;
You obviously don't want to eagerly create one Connection pool per database or schema to be open at all times. You'd need to keep at least 20000 or 100000 Connections open if you did, a nonstarter even before you get to the non-Connection resources used by the DataSource;
If, as is likely, requests for Connections for a particular tenant tend to cluster, you might consider lazily, dynamically instantiating pools, and destroying them after some timeout if they've not handled a request for a while (see the sketch below).
Good luck!
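A rough sketch of that lazily-create-and-evict idea using HikariCP; the class, eviction policy and limits below are made up for illustration and are not a drop-in solution:

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical pool manager: one small HikariCP pool per host+database,
// opened on first use and shut down again once it has been unused for a while.
public class TenantPoolManager {

    private static final long EVICT_AFTER_MS = 15 * 60 * 1000L;

    private final Map<String, Entry> pools = new HashMap<>();
    private final ScheduledExecutorService reaper = Executors.newSingleThreadScheduledExecutor();

    public TenantPoolManager() {
        reaper.scheduleAtFixedRate(this::evictIdlePools, 1, 1, TimeUnit.MINUTES);
    }

    // Returns the pool for one host+database, creating it on first use.
    public synchronized HikariDataSource dataSourceFor(String jdbcUrl, String user, String password) {
        Entry entry = pools.get(jdbcUrl);
        if (entry == null) {
            HikariConfig cfg = new HikariConfig();
            cfg.setJdbcUrl(jdbcUrl);
            cfg.setUsername(user);
            cfg.setPassword(password);
            cfg.setMaximumPoolSize(3); // keep per-database pools small
            cfg.setMinimumIdle(0);     // let an unused pool drain its connections
            entry = new Entry(new HikariDataSource(cfg));
            pools.put(jdbcUrl, entry);
        }
        entry.lastUsed = System.currentTimeMillis();
        return entry.dataSource;
    }

    // Shuts down pools that have not been asked for a connection recently.
    // A production version would also have to guard against a request that
    // obtained the DataSource just before it gets closed here.
    private synchronized void evictIdlePools() {
        long now = System.currentTimeMillis();
        for (Iterator<Entry> it = pools.values().iterator(); it.hasNext(); ) {
            Entry entry = it.next();
            boolean idle = now - entry.lastUsed > EVICT_AFTER_MS
                    && entry.dataSource.getHikariPoolMXBean().getActiveConnections() == 0;
            if (idle) {
                it.remove();
                entry.dataSource.close(); // closes the pool and its connections
            }
        }
    }

    private static final class Entry {
        final HikariDataSource dataSource;
        long lastUsed;

        Entry(HikariDataSource dataSource) {
            this.dataSource = dataSource;
        }
    }
}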

Is it possible to list all database connections currently in the pool?

I'm getting ActiveRecord::ConnectionTimeoutError in a daemon that runs independently from the rails app. I'm using Passenger with Apache and MySQL as the database.
Passenger's default pool size is 6 (at least that's what the documentation tells me), so it shouldn't use more than 6 connections.
I've set ActiveRecord's pool size to 10, even though I thought my daemon should only need one connection. The daemon is one process with multiple threads that call ActiveRecord here and there to save stuff to the database it shares with the Rails app.
What I need to figure out is whether the threads simply can't share one connection, or whether they just keep asking for new connections without releasing their old ones. I know I could just increase the pool size and postpone the problem, but the daemon can have hundreds of threads, so sooner or later the pool will run out of connections.
The first thing I would like to verify is that Passenger is indeed using just 6 connections and that the problem lies with the daemon. How do I test that?
Second, I would like to figure out whether every thread needs its own connection, or whether they just need to be told to reuse the connection they already have. If they do need their own connections, maybe they just need to be told not to hold on to them when they're not using them? The threads are, after all, sleeping most of the time.
You can get to the connection pools that ActiveRecord is using through ActiveRecord::Base.connection_handler.connection_pools; it should be an array of connection pools. You will probably only have one in there, and it has a connections method on it to get an array of the connections it knows about.
You can also do ActiveRecord::Base.connection_handler.connection_pools.each(&:clear_stale_cached_connections!) and it will check in any checked-out connections whose thread is no longer alive.
Don't know if that helps or confuses more.
As of February 2019, clear_stale_cached_connections! has been deprecated and moved to reap (see the linked commit).
The previously accepted answer, updated:
ActiveRecord::Base.connection_handler.connection_pools.each(&:reap)
