What database connection pool could be used to load-balance connections from a Tomcat web container to one of several Oracle database servers without using RAC clustering?
I'm assuming these are read-only databases, or that you're not concerned that different connections may see different data. If you want the data to be the same, you can use Streams replication, which doesn't require RAC.
The connection load balancing and failover happens in the listener. There's a lot of flexibility in how this works and this should get you started:
http://download.oracle.com/docs/cd/E11882_01/network.112/e10836/advcfg.htm#sthref858
The first part shows a simple client based load balance which is essentially picking a connection at random. Farther down it shows how to load balance based on actual server load.
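To make the "pick a connection at random" idea concrete, here is a minimal Python sketch of client-side load balancing with failover. It is not Oracle's implementation: the `connect` callable and the address strings are assumptions made up for illustration, standing in for whatever actually opens a connection to one listener.

```python
import random


def connect_with_load_balance(addresses, connect, rng=random):
    """Try the addresses in random order; return the first connection that opens."""
    candidates = list(addresses)
    rng.shuffle(candidates)  # random order spreads new connections across servers
    last_error = None
    for address in candidates:
        try:
            return connect(address)  # hypothetical callable: address -> connection
        except ConnectionError as exc:
            last_error = exc  # failover: fall through to the next address
    raise ConnectionError(f"no database server reachable: {last_error}")
```

Random selection knows nothing about how busy each server is, which is why the load-based balancing described farther down that page needs the listener's help.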
Also look into DRCP (Database Resident Connection Pooling) if you are using 11g.
Related
Reading this article: http://go-database-sql.org/accessing.html
It says that the sql.DB object is designed to be long-lived and that we should not Open() and Close() databases frequently. But what should I do if I have 10 different MySQL servers, sharded so that each server holds 511 databases, for example the way Pinterest shards their data with MySQL?
https://medium.com/#Pinterest_Engineering/sharding-pinterest-how-we-scaled-our-mysql-fleet-3f341e96ca6f
Would I then not need to access new nodes with new databases all the time? As I understand it, I would have to Open and Close the database connection constantly, depending on which node and database I have to access.
It also says that:
If you don’t treat the sql.DB as a long-lived object, you could
experience problems such as poor reuse and sharing of connections,
running out of available network resources, or sporadic failures due
to a lot of TCP connections remaining in TIME_WAIT status. Such
problems are signs that you’re not using database/sql as it was
designed.
Will this be a problem? How should I solve this issue then?
I am also interested in this question. I guess a solution could combine the following:
Minimize the number of idle connections in each pool with db.SetMaxIdleConns(N).
Keep a map[serverID]*sql.DB. When there is no entry for a server yet, open one and add it to the map.
Make data more local, so that backends usually go to “their” databases. However, Pinterest does not seem to do this.
Increase the limits on sockets and open files on the backend machines so they can keep more connections open.
Set a reasonable idle timeout so that very old unused connections get closed.
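The map idea can be sketched as follows. This is a language-neutral illustration written in Python, not Go; the same shape applies to a map[serverID]*sql.DB guarded by a mutex. The `open_handle` factory, the `locate_shard` mapping and the `dbNNNNN` naming are all hypothetical and are not taken from the Pinterest article. The point is that the registry is keyed by server, so 10 servers mean 10 long-lived handles, however many databases each server holds.

```python
import threading


class HandleRegistry:
    """Keeps one long-lived handle per server, created on first use."""

    def __init__(self, open_handle):
        self._open_handle = open_handle  # hypothetical factory: server_id -> handle
        self._handles = {}
        self._lock = threading.Lock()

    def get(self, server_id):
        with self._lock:
            handle = self._handles.get(server_id)
            if handle is None:
                handle = self._open_handle(server_id)  # opened once, then reused
                self._handles[server_id] = handle
            return handle


def locate_shard(shard_id, databases_per_server=511):
    """Hypothetical mapping from a shard number to (server index, database name)."""
    server_index, local_db = divmod(shard_id, databases_per_server)
    return server_index, f"db{local_db:05d}"
```

With a handle per server, the database can be chosen per query by qualifying table names (MySQL accepts `db_name.tbl_name`), so nothing needs to be opened or closed when a request moves to another database on the same server.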
I am currently reading the Oracle cx_Oracle tutorial.
There I came across non-pooled connections and DRCP. I am not a DBA, so I searched with Google but couldn't find anything.
Could somebody help me understand what they are and how they differ from each other?
Thank you.
Web tier and mid-tier applications typically have many threads of execution, which take turns using RDBMS resources. Currently, multi-threaded applications can share connections to the database efficiently, allowing great mid-tier scalability. Starting with Oracle 11g, application developers and administrators and DBAs can use Database Resident Connection Pooling to achieve such scalability by sharing connections among multi-process as well as multi-threaded applications that can span across mid-tier systems.
DRCP provides a connection pool in the database server for typical Web application usage scenarios where the application acquires a database connection, works on it for a relatively short duration, and then releases it. DRCP pools "dedicated" servers. A pooled server is the equivalent of a server foreground process and a database session combined.
DRCP complements middle-tier connection pools that share connections between threads in a middle-tier process. In addition, DRCP enables sharing of database connections across middle-tier processes on the same middle-tier host and even across middle-tier hosts. This results in significant reduction in key database resources needed to support a large number of client connections, thereby reducing the database tier memory footprint and boosting the scalability of both middle-tier and database tiers. Having a pool of readily available servers also has the additional benefit of reducing the cost of creating and tearing down client connections.
DRCP is especially relevant for architectures with multi-process single threaded application servers (such as PHP/Apache) that cannot perform middle-tier connection pooling. The database can still scale to tens of thousands of simultaneous connections with DRCP.
DRCP stands for Database Resident Connection Pooling as opposed to "non-pooled" connections
In short, with DRCP, Oracle will cache all the connections opened, making a pool out of them, and will use the connections in the pool for future requests.
The aim is to avoid opening new connections when some existing connections are available/free, and thus to save database resources and time (the time needed to open a new connection).
If all connections in the pool are being used, then a new connection is automatically created (by Oracle) and added to the pool.
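The reuse-then-grow behaviour described above can be illustrated with a toy pool. This is only a sketch of the general pooling idea in Python, not how DRCP is implemented inside Oracle; `open_connection` is a hypothetical factory, and a real pool would also cap its size and handle concurrent callers.

```python
class SimplePool:
    """Toy pool: hand out a free connection if there is one, otherwise open a new one."""

    def __init__(self, open_connection):
        self._open_connection = open_connection  # hypothetical factory, no arguments
        self._free = []
        self.opened = 0  # how many real connections were ever opened

    def acquire(self):
        if self._free:
            return self._free.pop()  # reuse: no new connection is opened
        self.opened += 1
        return self._open_connection()

    def release(self, connection):
        self._free.append(connection)  # keep the connection for the next caller
```

If every request calls `acquire()` and `release()` instead of opening and closing its own connection, `opened` only grows up to the peak number of simultaneous users, not with the number of requests.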
With non-pooled connections, a connection is created and (in theory) closed by the application querying the database.
For instance, on a static PHP page querying the database, you always have the same scheme:
Open DB connection
Queries on the DB
Close the DB connection
And you know what your scheme will be.
Now suppose you have a dynamic PHP page (with AJAX or something) that queries the database only when the user takes some specific action: the scheme becomes unpredictable. There DRCP can be good for your database, especially if you have a lot of users and possible requests.
This quote from the official documentation summarizes fairly well the concept and when it should be used:
Database Resident Connection Pool (DRCP) is a connection pool in the
server that is shared across many clients. You should use DRCP in
connection pools where the number of active connections is fairly less
than the number of open connections. As the number of instances of
connection pools that can share the connections from DRCP pool
increases, the benefits derived from using DRCP increases. DRCP
increases Database server scalability and resolves the resource
wastage issue that is associated with middle-tier connection pooling.
DRCP increases the level of "centralization" of the pools:
Classic connection pools are managed within the client middleware. This means that if, for instance, you have several independent web servers, each one will likely have its own server-managed connection pool. There is a pool per server and the server is responsible for managing it. For instance, you may have 3 separate pools with a limit of 50 connections per pool. Depending on usage patterns this may be wasteful, because you may very seldom use the total of 150 connections, while on the other hand you may hit the individual limit of 50 connections very often.
DRCP is a single pool managed by the DB server, not the client servers. This can lead to a more efficient distribution of the connections. In the example above, the 3 servers may share the same database-managed pool of fewer than 150 connections, say 100 connections. And if two servers are idle, the third server can take up all 100 connections if needed.
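The arithmetic behind this example can be checked with a small Python sketch. The demand figures are invented for illustration; the two functions simply count how many simultaneous connection requests can be satisfied under each arrangement.

```python
def served_with_private_pools(demands, limit_per_pool):
    """Each server can use at most its own pool's limit."""
    return sum(min(demand, limit_per_pool) for demand in demands)


def served_with_shared_pool(demands, shared_limit):
    """All servers draw from one pool with a single overall limit."""
    return min(sum(demands), shared_limit)


# One busy server and two nearly idle ones (made-up numbers).
demands = [90, 5, 5]
private = served_with_private_pools(demands, limit_per_pool=50)
shared = served_with_shared_pool(demands, shared_limit=100)
print(private, shared)
```

With these numbers, the three private pools satisfy 60 requests even though 150 connections are provisioned, while the shared pool of 100 satisfies all 100.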
See Oracle Database 11g: The Top New Features for DBAs and Developers for more details and About Database Resident Connection Pooling:
This results in significant reduction in key database resources needed to support a large number of client connections, thereby reducing the database tier memory footprint and boosting the scalability of both middle-tier and database tiers
In addition, DRCP compensates for the complete lack of middleware connection pools in certain technologies (quoted again from About Database Resident Connection Pooling):
DRCP is especially relevant for architectures with multi-process single threaded application servers (such as PHP/Apache) that cannot perform middle-tier connection pooling. The database can still scale to tens of thousands of simultaneous connections with DRCP.
As a further reference, see for instance Connection pooling in PHP - Stack Overflow.
We have a database failover setup for BPM/WAS. DB2 is our database. We have configured automatic client reroute for all connections and tuned the connection pool settings. But still, whenever a database failover happens, the apps come back up extremely slowly. (If we recycle the JVMs, the apps come up quickly, but we would like to achieve clean database failover and recovery without restarting the JVMs.) Please help me fix this issue.
Thanks,
Kumar.
We're in the process of rewriting a web application in Java, coming from PHP. I think, but I'm not really sure, that we might run into problems in regard to connection pooling. The application in itself is multitenant, and is a combination of "Separate database" and "Separate schema".
For every Postgres database server instance, there can be more than 1 database (named schemas_XXX) holding more than 1 schema (where each schema is a tenant). On signup, one of two things can happen:
A new tenant schema is created in the highest numbered schemas_XXX database.
The signup process sees that a database has been fully allocated and creates a new schemas_XXX+1 database. In this new database, the tenant schema is created.
All tenants are known via a central registry (also a Postgres database). When a session is established the registry will resolve the host, database and schema of the tenant and a database session is established for that HTTP request.
Now, the problem I think I'm seeing here is twofold:
A JDBC connection pool is defined when the application starts. With that I mean that all databases (host+database) are known at startup. This conflicts with the signup process.
As I'm writing this we have ~20 database servers with ~1000 databases (for a total of ~100k tenant schemas). Given those numbers, I would need 20*1000 data sources for every instance of the application. I'm assuming that all pools are also, at one time or another, started. I'm not sure how many resources a pool allocates, but it must be a non-trivial amount for 20k pools.
So, is it even feasible to assume that a connection pool can be used for this?
For the first problem, I guess that a pool with support for JMX can be used, and that we create a new datasource when and if a new schemas_XXX database is created. The larger issue is the huge number of pools. For this, I guess, some sort of pool manager should be used that can terminate a pool that has no open connections (and also start a pool on demand). I have not found anything that supports this.
What options do I have? Or should I just bite the bullet and fall back to an out of process connection pool such as PgBouncer and establish a plain JDBC connection per request, similar to how we're handling it now with PHP?
A few things:
A Connection pool need not be instantiated only at application start-up. You can create or destroy them whenever you want;
You obviously don't want to eagerly create one Connection pool per database or schema to be open at all times. You'd need to keep at least 20000 or 100000 Connections open if you did, a nonstarter even before you get to the non-Connection resources used by the DataSource;
If, as is likely, requests for Connections for a particular tenant tend to cluster, you might consider lazily, dynamically instantiating pools, and destroying them after some timeout if they've not handled a request for a while.
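The lazy, time-limited approach in the last point can be sketched roughly as below. This is an illustration in Python rather than Java, and it is not any particular pool library's API: `create_pool`, `close_pool` and the injectable `clock` are hypothetical names. It is also not thread-safe, and a real version would have to avoid closing a pool that still has connections checked out.

```python
import time


class LazyPoolManager:
    """Creates a pool per key on first use and closes pools that have sat idle."""

    def __init__(self, create_pool, close_pool, idle_timeout, clock=time.monotonic):
        self._create_pool = create_pool  # hypothetical: key -> pool
        self._close_pool = close_pool    # hypothetical: pool -> None
        self._idle_timeout = idle_timeout
        self._clock = clock
        self._pools = {}  # key -> (pool, time of last use)

    def get(self, key):
        entry = self._pools.get(key)
        pool = entry[0] if entry else self._create_pool(key)
        self._pools[key] = (pool, self._clock())  # record the latest use
        return pool

    def evict_idle(self):
        """Close and forget every pool not used within idle_timeout seconds."""
        now = self._clock()
        for key, (pool, last_used) in list(self._pools.items()):
            if now - last_used >= self._idle_timeout:
                del self._pools[key]
                self._close_pool(pool)
```

Here the key would be something like a (host, database) pair resolved from the tenant registry, and `evict_idle()` would be called periodically from a background task, so only the databases with recent traffic hold open connections.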
Good luck!
I am using a Pentaho BI server installation as a third-party component of my web application. I use its Saiku analytics and reporting files by embedding their links in an iframe in my application. The problem is that I don't understand how it creates database connections, in terms of numbers, because it often throws an error along the lines of 'No connection is available in pool'. I know there are properties like max available connections, max idle connections, wait and SQL validation. But how are connections released? And if Pentaho handles this in its own way, then how? Increasing the maximum number of available connections will put load on the database server when many users are using my BI server.
One solution I found is simply to restart my BI server, but that is not a valid solution for a production environment. Another solution I can think of is a scheduler, but I have no clue how to go about it and can't find proper information online.
The defaults for max connections are incredibly low. This is standard Tomcat connection pooling stuff; I would definitely try increasing the default and see if that helps. You can monitor concurrent connections on the DB side: just because you have 100 connections to the DB, it doesn't necessarily mean they'll all be used at once.
Also, are you using MySQL? You should try the c3p0 pooling library; it handles timeouts and similar things better than the standard pool, so you shouldn't ever get dead connections sitting in the pool.