In my project, developers use a single instance of Connection instead of a connection pool on an Oracle 12c database.
Using a pool is a common practice and Oracle itself documents it: http://docs.oracle.com/database/121/JJUCP/get_started.htm#JJUCP8120.
But the JDBC 4.2 specification says:
13.1.1 Creating Statements
Each Connection object can create multiple Statement objects that may be used concurrently by the program.
Why use a pool of connections instead of a single connection, if it's possible to use statements to manage concurrency?
The Oracle Database Dev Team strongly discourages using a single Connection in multiple threads. That almost always causes problems. As a general rule we will not consider any problem report that does this.
A Connection can have multiple Statements and/or ResultSets open at one time but only one can execute at a time. Connections are strictly single threaded and blocking. We try to prevent multiple threads from accessing a Connection simultaneously but there are a few odd cases where it is possible. These are all but guaranteed to cause problems. (It is not practical to fix or prevent these cases mostly for performance reasons. Just don't share a single Connection across multiple threads.)
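The "one thread per Connection at a time" rule is what a pool enforces. Here is a minimal sketch, assuming a hypothetical `BorrowPool` class that stands in for a real pool such as UCP. The pooled objects are plain placeholders rather than real `java.sql.Connection` instances, and the names `BorrowPool`, `withItem`, and `PlaceholderConnection` are illustrative, not part of any Oracle API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class BorrowPoolDemo {
    // Returns the highest number of threads ever observed inside a single placeholder.
    static int maxSharing(int poolSize, int threads) {
        List<PlaceholderConnection> conns = new ArrayList<>();
        for (int i = 0; i < poolSize; i++) conns.add(new PlaceholderConnection());
        BorrowPool<PlaceholderConnection> pool = new BorrowPool<>(conns);

        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread w = new Thread(() -> {
                for (int i = 0; i < 1000; i++) pool.withItem(c -> c.execute(21));
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        int max = 0;
        for (PlaceholderConnection c : conns) max = Math.max(max, c.maxActive.get());
        return max;
    }

    public static void main(String[] args) {
        System.out.println("max threads inside one connection: " + maxSharing(2, 8));
    }
}

// Hypothetical fixed-size pool: a borrowed item belongs to one thread until it is returned.
class BorrowPool<T> {
    private final BlockingQueue<T> idle;

    BorrowPool(List<T> items) {
        this.idle = new ArrayBlockingQueue<>(items.size(), false, items);
    }

    // Blocks until an item is free, runs the work on it, and always returns the item.
    <R> R withItem(Function<T, R> work) {
        T item;
        try {
            item = idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for an item", e);
        }
        try {
            return work.apply(item);
        } finally {
            idle.add(item);
        }
    }
}

// Placeholder for a connection; it records how many threads are inside it at once.
class PlaceholderConnection {
    private final AtomicInteger active = new AtomicInteger();
    final AtomicInteger maxActive = new AtomicInteger();

    int execute(int x) {
        maxActive.accumulateAndGet(active.incrementAndGet(), Math::max);
        try {
            return x * 2; // stand-in for a round trip to the database
        } finally {
            active.decrementAndGet();
        }
    }
}
```

Because an item leaves the idle queue while it is borrowed, no two threads can hold the same placeholder at once, however many threads compete for the pool.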
If a client connects to the database via a dedicated server connection, then that database session will only serve that client. If the client connects to the database via a shared server connection, then a given database session may serve multiple clients over its lifetime.
This is documented here.
Also, at any one point in time, a session can only execute one thing at a time. If that wasn't the case, then running things in parallel wouldn't spawn multiple other sessions!
A single connection cannot execute several statements concurrently.
Yes, one connection can execute more than one statement. It is up to the programmer to choose between connection pooling and multiple statements when executing over more than one thread. Most databases on the market can handle multiple statements in one connection.
I need some software architecture insight on this. Which of the following is more efficient in terms of resources (CPU, memory, database)?
Having a single database connection in one flow? (Close connection only after everything is done, including business logic)
Having multiple database connections in one flow? (Open then close the database connection immediately after the query is executed)
By business logic, I mean the step where data returned from the query is sanitized or manipulated according to business rules.
Attached is a diagram for visual representation.
UPDATE:
Programming language: PHP (Laravel for web app, Lumen for API)
Database: MySQL
Host: AWS
Opening a new connection between your runtime and your database requires the OS to create a new socket (if the runtime and the database are on the same system this is probably a Unix domain socket; otherwise it is a TCP socket).
This socket creation has overhead of its own.
So I don't suggest opening and closing a connection after each database call.
But there are specific conditions where you might want to do that.
For example, if your database allows only a limited number of concurrent connections and you have thousands of long-running processes using those connections, the second approach may be the better fit.
Is there any problem with using many open connections at the same time from different threads?
From what I've read it's thread safe by default, but can this hurt performance rather than improve it?
Having multiple connections is not a problem; the only thing to keep in mind is that SQLite does not support concurrent write transactions. From the SQLite site:
SQLite supports an unlimited number of simultaneous readers, but it will only allow one writer at any instant in time. For many situations, this is not a problem. Writers queue up. Each application does its database work quickly and moves on, and no lock lasts for more than a few dozen milliseconds. But there are some applications that require more concurrency, and those applications may need to seek a different solution.
SQLite is an "untypical" database management system: in practice it is a library that offers SQL as a language to access a simple "database-in-a-file", plus a few other functionalities of DBMSs. For instance, it has no real concurrency control (it uses the operating system's functions to lock the db file).
So, if you need concurrent insertions into a database, you should use something else, for instance PostgreSQL.
The documentation says:
A connection can only be used from within the thread that created it.
Moving connections between threads or creating queries from a
different thread is not supported.
In addition, the third party libraries used by the QSqlDrivers can
impose further restrictions on using the SQL Module in a multithreaded
program. Consult the manual of your database client for more
information.
This means you have to create the database connection in the thread that will use it. The docs for the QSqlDatabase class give this description:
The QSqlDatabase class represents a connection to a database.
The QSqlDatabase class provides an interface for accessing a database
through a connection. An instance of QSqlDatabase represents the
connection. The connection provides access to the database via one of
the supported database drivers, which are derived from QSqlDriver.
Create a connection (i.e., an instance of QSqlDatabase) by calling one
of the static addDatabase() functions, where you specify the driver or
type of driver to use (i.e., what kind of database will you access?)
and a connection name.
Calling the static addDatabase() function is the way to create a connection.
But as Renzo said, SQLite does not support multiple write transactions at the same time. So you need some mechanism (a wrapper) for synchronizing threads, such as a task queue guarded by a low-level mutex or something similar. You can find more information in the docs.
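One way to picture the task-queue idea is a single writer thread that every other thread hands its writes to. The sketch below uses Java's `ExecutorService` rather than Qt, so it only illustrates the pattern. The class name `SingleWriter` and the in-memory list standing in for the database file are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SingleWriterDemo {
    // Several producer threads submit inserts; returns the final row count.
    static int insertFromManyThreads(int threads, int perThread) {
        try (SingleWriter writer = new SingleWriter()) {
            List<Thread> producers = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int id = t;
                Thread p = new Thread(() -> {
                    for (int i = 0; i < perThread; i++) writer.insert("t" + id + "-row" + i);
                });
                producers.add(p);
                p.start();
            }
            for (Thread p : producers) {
                try {
                    p.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
            return writer.rowCount();
        }
    }

    public static void main(String[] args) {
        System.out.println("rows written: " + insertFromManyThreads(4, 250));
    }
}

// Hypothetical wrapper: all writes are queued and executed by one dedicated thread.
class SingleWriter implements AutoCloseable {
    private final ExecutorService writerThread = Executors.newSingleThreadExecutor();
    // Stand-in for the database file; it is only ever touched from the writer thread.
    private final List<String> rows = new ArrayList<>();

    // Queues an insert and returns immediately; the Future completes once it is applied.
    Future<Integer> insert(String row) {
        Callable<Integer> task = () -> {
            rows.add(row);
            return rows.size();
        };
        return writerThread.submit(task);
    }

    // Runs behind all previously queued writes, so it sees every earlier insert.
    int rowCount() {
        Callable<Integer> task = () -> rows.size();
        try {
            return writerThread.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    @Override
    public void close() {
        writerThread.shutdown(); // lets queued tasks finish, then stops the thread
    }
}
```

The queue inside the executor plays the role of the mutex-guarded task queue: callers never touch the storage directly, so writes are applied one at a time in submission order.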
We're in the process of rewriting a web application in Java, coming from PHP. I think, but I'm not really sure, that we might run into problems in regard to connection pooling. The application in itself is multitenant, and is a combination of "Separate database" and "Separate schema".
For every Postgres database server instance, there can be more than 1 database (named schemax_XXX) holding more than 1 schema (where the schema is a tenant). On signup, one of two things can happen:
A new tenant schema is created in the highest numbered schema_XXX database.
The signup process sees that a database has been fully allocated and creates a new schemas_XXX+1 database. In this new database, the tenant schema is created.
All tenants are known via a central registry (also a Postgres database). When a session is established the registry will resolve the host, database and schema of the tenant and a database session is established for that HTTP request.
Now, the problem I think I'm seeing here is twofold:
A JDBC connection pool is defined when the application starts. With that I mean that all databases (host+database) are known at startup. This conflicts with the signup process.
As I'm writing this we have ~20 database servers with ~1000 databases each (for a total of ~100k tenant schemas). Given those numbers, I would need 20*1000 data sources for every instance of the application. I'm assuming that every pool will, at one time or another, be started. I'm not sure how many resources a pool allocates, but it must be a non-trivial amount for 20k pools.
So, is it feasible to even assume that a connection pool can be used for this?
For the first problem, I guess that a pool with support for JMX can be used, and that we create a new datasource when and if a new schemas_XXX database is created. The larger issue is the huge number of pools. For this, I guess, some sort of pool manager should be used that can terminate a pool that has no open connections (and also start a pool on demand). I have not found anything that supports this.
What options do I have? Or should I just bite the bullet and fall back to an out of process connection pool such as PgBouncer and establish a plain JDBC connection per request, similar to how we're handling it now with PHP?
A few things:
A Connection pool need not be instantiated only at application start-up. You can create or destroy them whenever you want;
You obviously don't want to eagerly create one Connection pool per database or schema to be open at all times. You'd need to keep at least 20000 or 100000 Connections open if you did, a nonstarter even before you get to the non-Connection resources used by the DataSource;
If, as is likely, requests for Connections for a particular tenant tend to cluster, you might consider lazily, dynamically instantiating pools, and destroying them after some timeout if they've not handled a request for a while.
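The lazy-instantiate-and-expire idea can be sketched roughly as follows. `LazyPoolRegistry`, `FakePool`, and the `"host/database"` keys are hypothetical names for illustration; a real setup would plug in an actual pool implementation as the factory. The current time is passed in explicitly so the example stays deterministic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class LazyPoolRegistryDemo {
    // Returns { number of pools evicted, number of pools still open }.
    static int[] run() {
        LazyPoolRegistry<FakePool> registry = new LazyPoolRegistry<>(FakePool::new, 1000);
        registry.poolFor("host-a/db-1", 0);
        registry.poolFor("host-b/db-2", 0);
        registry.poolFor("host-a/db-1", 900);   // the first pool is used again
        int evicted = registry.evictIdle(1500); // only host-b/db-2 has been idle >= 1000 ms
        return new int[] { evicted, registry.openPools() };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("evicted=" + r[0] + " open=" + r[1]);
    }
}

// Stand-in for a real connection pool.
class FakePool implements AutoCloseable {
    final String name;
    volatile boolean closed;

    FakePool(String name) {
        this.name = name;
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Hypothetical registry: creates a pool on first request, closes pools that sit idle.
class LazyPoolRegistry<P extends AutoCloseable> {
    private static final class Entry<T> {
        final T pool;
        volatile long lastUsedMillis;

        Entry(T pool, long now) {
            this.pool = pool;
            this.lastUsedMillis = now;
        }
    }

    private final Map<String, Entry<P>> pools = new ConcurrentHashMap<>();
    private final Function<String, P> poolFactory;
    private final long idleTimeoutMillis;

    LazyPoolRegistry(Function<String, P> poolFactory, long idleTimeoutMillis) {
        this.poolFactory = poolFactory;
        this.idleTimeoutMillis = idleTimeoutMillis;
    }

    // Returns the pool for a key, creating it on demand and recording the access time.
    P poolFor(String databaseKey, long nowMillis) {
        Entry<P> e = pools.compute(databaseKey, (k, old) -> {
            Entry<P> entry = (old != null) ? old : new Entry<P>(poolFactory.apply(k), nowMillis);
            entry.lastUsedMillis = nowMillis;
            return entry;
        });
        return e.pool;
    }

    // Closes and removes every pool not requested within the idle timeout.
    int evictIdle(long nowMillis) {
        int[] closed = {0};
        for (String key : pools.keySet()) {
            pools.computeIfPresent(key, (k, entry) -> {
                if (nowMillis - entry.lastUsedMillis < idleTimeoutMillis) return entry;
                try {
                    entry.pool.close();
                } catch (Exception ex) {
                    // best effort in a sketch; a real system would log this
                }
                closed[0]++;
                return null; // returning null removes the mapping
            });
        }
        return closed[0];
    }

    int openPools() {
        return pools.size();
    }
}
```

In practice `evictIdle` would run from a scheduled task, and a real version would also need to avoid closing a pool that a request obtained just before eviction; this sketch does not handle that race.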
Good luck!
I'm developing a large-scale website, and when I was checking LINQ queries with SQL Profiler I found that there are about 30 login/logout actions. I want to reduce these actions by using one open connection for all DataContexts, but I don't know how to do it. Do you have any suggestions?
That is a pretty bad idea. You are building a large, scalable website, so let LINQ to SQL handle the connection itself.
LINQ to SQL internally opens and releases connections in a way that is effective for its usage. It uses the default ADO.NET connection pooling, so connections are correctly reused rather than opened anew for every context.
Using a single connection for all contexts is exactly what makes your application non-scalable and broken. A single connection allows only a single transaction, so as soon as two requests want to make concurrent changes in their own transactions, your application will crash.
Do not share contexts and do not share connections. Let ADO.NET handle connection pooling and create a new context for every request, or you can expect serious issues.
I'm seeing OraclePreparedStatement executeQuery() exhibit serialization. That is, I have two queries that I want to run concurrently against an Oracle database, using the same connection. However, the OraclePreparedStatement seems to explicitly prohibit concurrent queries.
My question is: Is this serialization a necessary artifact of running both queries on the same connection, or is this configurable?
I've tried setting readOnly to true for the duration of the two queries, but they still serialize.
I believe that the methods of Oracle's Connection class are synchronized. See this API description. This would then be an artifact of using the same connection, and not a configurable property. If you need to get around this limitation, you can either use 2 connections or look into connection pooling if you want a more flexible solution.
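To illustrate why synchronized methods would produce this behaviour, here is a small sketch. `SyncConnection` is a made-up class, not Oracle's driver, and the assumption that the real driver locks per connection comes from the answer above rather than from inspecting the driver. Two threads calling a synchronized method on the same object can never be inside it together, while two separate objects do not block each other.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class SerializationDemo {
    // True only if both queries were inside runQuery at the same moment.
    static boolean queriesOverlap(SyncConnection a, SyncConnection b) {
        CountDownLatch bothInside = new CountDownLatch(2);
        boolean[] sawOther = new boolean[2];
        Thread t1 = new Thread(() -> sawOther[0] = a.runQuery(bothInside));
        Thread t2 = new Thread(() -> sawOther[1] = b.runQuery(bothInside));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return sawOther[0] && sawOther[1];
    }

    public static void main(String[] args) {
        SyncConnection shared = new SyncConnection();
        System.out.println("same connection overlaps: " + queriesOverlap(shared, shared));
        System.out.println("two connections overlap: "
                + queriesOverlap(new SyncConnection(), new SyncConnection()));
    }
}

// Made-up connection whose query method holds the object's monitor while it runs.
class SyncConnection {
    synchronized boolean runQuery(CountDownLatch bothInside) {
        bothInside.countDown();
        try {
            // Succeeds only if the other query is also executing right now.
            return bothInside.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With a shared instance, the first thread times out waiting because the second cannot enter until the lock is released, which is the same serialization seen with two queries on one connection.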