Using MongoDB API in a Web application - mongodb-java

I'd need to use the MongoClient and DB objects repeatedly in a Web application:
MongoClient mongoClient = new MongoClient();
DB db = mongoClient.getDB( "test" );
Is it safe to cache and re-use these objects among different clients accessing our application?
Thanks

You should create this once and inject it via CDI/Guice, if you can. If you can't do that, you could use a static factory method to return the one instance of your MongoClient. MongoClient maintains a connection pool and is safe to use between different threads. If you create a new MongoClient with each request, not only is it going to be a performance hit to set up that pool and open a new connection, but you'll likely leave dangling connections unless you properly close that MongoClient at the end of the request.
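If CDI/Guice isn't an option, the static factory can be as simple as an initialization-on-demand holder. A minimal sketch of that approach; `Client` here is a placeholder for `MongoClient` so the snippet runs without the driver jar, but the real client would be constructed the same way:

```java
// Placeholder standing in for com.mongodb.MongoClient.
class Client {
    Client() { /* the real MongoClient would open its connection pool here */ }
}

final class Mongo {
    private Mongo() {}

    // Initialization-on-demand holder: CLIENT is created lazily, exactly
    // once, with thread safety guaranteed by JVM class initialization.
    private static class Holder {
        static final Client CLIENT = new Client();
    }

    static Client client() {
        return Holder.CLIENT;
    }
}
```

Every caller of `Mongo.client()` then shares the same instance, and therefore the same connection pool.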

Yes. From Getting Started with Java Driver, "you will only need one instance of class MongoClient even with multiple threads".
As a side note, the Mongo Java driver is a pain to use. The dev team I'm part of is very happy with Jongo, a wrapper around the Java driver that allows queries to be written more like shell queries.

Related

While initializing the RestHighLevelClient, do we always need to use a new RestHighLevelClient and a new host?


RestHighLevelClient client =
new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http")));
My question is: do we always need to use a new HttpHost? Is there a way we can maintain a pool of connections?
We have GET and POST APIs, and client.search() is called whenever a request is triggered from the front end. So every time a new request is triggered, a new connection is created. Is there a better approach to handling this?
You don't need to create multiple RestHighLevelClient instances in your application. As mentioned in the official doc, the JHLRC is a thread-safe client, typically instantiated by the application at startup, which can be used by all the requests in your application.
From the same doc:
The RestHighLevelClient is thread-safe. It is typically instantiated
by the application at startup time or when the first request is
executed.
Internally, the JHLRC maintains a pool of connections and also load-balances between the Elasticsearch nodes, so you don't have to worry about these internals.
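A minimal sketch of the recommended pattern: one client, created at startup, shared by every request handler. `SearchClient` here is a placeholder for `RestHighLevelClient` so the snippet runs without the Elasticsearch dependency:

```java
// Placeholder for org.elasticsearch.client.RestHighLevelClient.
class SearchClient {
    String search(String query) { return "results for " + query; }
}

class SearchService {
    // One client, created at startup and shared by every request thread;
    // since the real client is thread-safe, handlers reuse it instead of
    // building a new client (and connection pool) per request.
    static final SearchClient CLIENT = new SearchClient();

    static String handleRequest(String query) {
        return CLIENT.search(query);
    }
}
```

Each GET/POST handler then calls `SearchService.handleRequest(...)` and never constructs a client of its own.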

Which connection pool implementation has the behaviour that I want?

So I am running a Spring Boot server which I use to query a MySQL database. So far I have been using the auto-configured HikariCP connection pool with JOOQ, so I had almost nothing to do with the connection pool. But now I need to query two different schemas (on the same server), and it seems like I can't auto-configure two connection pools, so I have to tinker with the DataSource myself. I would like to keep the native behaviour of the pool, i.e. have a set of persistent connections, so that the server can dispatch the queries and, once a query is resolved, the connection is still there and free to use again. I have found multiple connection pool implementations allowing multiple DataSources to query multiple servers, but I don't know whether each of them has the behaviour I just described.
Implementation #1 :
https://www.ru-rocker.com/2018/01/28/configure-multiple-data-source-spring-boot/
Implementation #2 :
https://www.stubbornjava.com/posts/database-connection-pooling-in-java-with-hikaricp
I feel like #2 is the most straightforward solution, but I am sceptical about the idea of creating a new DataSource every time I want to query. If I don't close it, am I just opening new connections over and over again? So obviously I would have to close them once finished, but then it's not really a connection pool anymore. (Or am I misunderstanding this?)
Meanwhile #1 seems more reliable, but again, I would be calling new HikariDataSource every time, so is that what I am looking for?
(Or is there a simpler solution that I have been missing, given that I need to query two different schemas but still on the same server and dialect?)
OK, so it turns out I don't have to set up multiple connections in my case. As I am querying the same server with the same credentials, I don't have to set up a connection for each schema. I just removed the schema that I specified in my JDBC URL config:
spring.datasource.url=jdbc:mysql://localhost:5656/db_name?useUnicode=true&serverTimezone=UTC
Becomes
spring.datasource.url=jdbc:mysql://localhost:5656/?useUnicode=true&serverTimezone=UTC
And then, as I had already generated the POJOs with the JOOQ generator, I could reference my tables through the schema objects, i.e. Client.CLIENT.ID.as("idClient") becomes ClientSchema.CLIENTSCHEMA.CLIENT.ID.as("idClient"). This way I can query multiple schemas without setting up any additional connection.
How to configure Maven and JOOQ to generate sources from multiple schemas:
https://www.jooq.org/doc/3.13/manual/code-generation/codegen-advanced/codegen-config-database/codegen-database-catalog-and-schema-mapping/

Using Spring Cloud Connector for Heroku in order to connect to multiple RedisLabs databases

I have a requirement for multiple RedisLabs databases for my application as described in their home page:
multiple dedicated databases in a plan
We enable multiple DBs in a single plan, each running in a dedicated process and in a non-blocking manner.
I rely on Spring Cloud Connectors in order to connect to Heroku (or Foreman in local) and it seems the RedisServiceInfoCreator class allows for a single RedisLabs URL i.e. REDISCLOUD_URL
Here is how I have configured my first redis connection factory:
@Configuration
@Profile({Profiles.CLOUD, Profiles.DEFAULT})
public class RedisCloudConfiguration extends AbstractCloudConfig {

    @Bean
    public RedisConnectionFactory redisConnectionFactory() {
        PoolConfig poolConfig = ...
        return connectionFactory().redisConnectionFactory("REDISCLOUD", new PooledServiceConnectorConfig(poolConfig));
    }
...
How I am supposed to configure a second connection factory if I intend to use several redis labs databases?
Redis Cloud will set an env var for you only for the first resource in each add-on that you create.
If you create multiple resources in an add-on, you should either set an env var yourself, or use the new endpoint directly in your code.
In short the answer is yes: RedisConnectionFactory should be using Jedis in order to connect to your Redis DB. It uses a Jedis pool, which can only work with a single Redis endpoint; in this regard there is no difference between RedisLabs and a basic Redis.
You should create several connection pools to work with several Redis DBs/endpoints.
Just to extend on that: if you are using multiple DBs to scale, there is no need to with RedisLabs, as they support clustering behind a single endpoint. So you can simply create a single DB with as much memory as needed; RedisLabs will create a cluster for you and scale your Redis automatically.
If your app does require logical separation, then creating multiple DBs is the right way to go.

How to safely and efficiently connect to a MongoDB replicaset instance with the C# Driver

I am using MongoDB with the C# driver and am wondering what is the most efficient yet safe way to create connections to the database.
Thread Safety
According to the Mongo DB C# driver documentation the MongoClient, MongoServer, MongoDatabase, MongoCollection and MongoGridFS classes are thread safe. Does this mean I can have a singleton instance of MongoClient or MongoDatabase?
The documentation also states that a connection pool is used for MongoClient, so the management of connections to MongoDB is abstracted from the MongoClient class anyway.
Example Scenario
Let's say I have three MongoDB instances in my replicaset; so I create MongoClient and MongoDatabase objects based upon the three server addresses for these instances. Can I create a static singleton for the database and client objects and use them across multiple requests simultaneously? What if one of the instances dies; if I cache the Mongo objects, how can I make sure this scenario is dealt with safely?
In my project I'm using a singleton MongoClient only, then get the MongoServer and other objects from the MongoClient.
This is because, as you said, the connection pool lives in the MongoClient, and I definitely don't want more than one connection pool. Here's what the documentation says:
When you are connecting to a replica set you will still use only one
instance of MongoClient, which represents the replica set as a whole.
The driver automatically finds all the members of the replica set and
identifies the current primary.
Actually, MongoClient was added to the C# driver in 1.7 to represent the whole replica set and handle failover and load balancing, because MongoServer doesn't have the ability to do that. Thus you shouldn't cache a MongoServer, because once that server goes offline you won't know it.
EDIT: Just had a look at the source code, and I may have made a mistake. The MongoClient doesn't handle the connection pool; the MongoServer does (at least up to driver 1.7; I haven't looked at the latest driver source yet). This makes sense, because a MongoServer represents a real Mongo instance, and one connection pool stores connections only to that server.

jConnect4 pooled connection does not work as documented

The official Sybase jConnect Programmers Reference suggests the following way to use pooled connections:
SybConnectionPoolDataSource connectionPoolDataSource = new SybConnectionPoolDataSource();
...
Connection ds = connectionPoolDataSource.getConnection();
...
ds.close();
However, getConnection always causes an exception. I decompiled SybConnectionPoolDataSource and found that the method explicitly raises an error:
public Connection getConnection() throws SQLException
{
    ErrorMessage.raiseError("JZ0S3", "getConnection()");
    return null;
}
Does anyone have an idea why the documentation contradicts the implementation?
I can't comment specifically for Sybase because 1) I don't use it and 2) your link doesn't work, but I can try to give you a theory based on my own experience maintaining a JDBC driver (Jaybird/Firebird JDBC) and looking at what some of the other implementations do.
The ConnectionPoolDataSource is probably the least understood part of the JDBC API. Contrary to what the naming suggests, and how it has been implemented in some JDBC implementations, this interface SHOULD NOT provide connection pooling and should not implement DataSource (or at least: doing that can lead to confusion and bugs, in my own experience).
The javadoc of the ConnectionPoolDataSource is not very helpful, the javax.sql package documentation provides a little bit more info, but you really need to look at the JDBC 4.1 specification, Chapter 11 Connection Pooling to get a good idea how it should work:
[...] the JDBC driver provides an implementation of ConnectionPoolDataSource that the application server uses to build and manage the connection pool.
In other words: ConnectionPoolDataSource isn't meant for direct use by a developer, but instead is used by an application server for its connection pool; it isn't a connection pool itself.
The application server provides its clients with an implementation of the DataSource interface that makes connection pooling transparent to the client.
So the connection pool is made available to the user by means of a normal DataSource implementation. The user uses it as if it were one that doesn't provide pooling, and uses the connections obtained as if they were normal physical connections instead of ones obtained from a connection pool:
When an application is finished using a connection, it closes the logical connection using the method Connection.close. This closes the logical connection but does not close the physical connection. Instead, the physical connection is returned to the pool so that it can be reused.
Connection pooling is completely transparent to the client: A client obtains a pooled connection and uses it just the same way it obtains and uses a non pooled connection.
This is further supported by the documentation of PooledConnection (the object created by a ConnectionPoolDataSource):
An application programmer does not use the PooledConnection interface directly; rather, it is used by a middle tier infrastructure that manages the pooling of connections.
When an application calls the method DataSource.getConnection, it gets back a Connection object. If connection pooling is being done, that Connection object is actually a handle to a PooledConnection object, which is a physical connection.
The connection pool manager, typically the application server, maintains a pool of PooledConnection objects. If there is a PooledConnection object available in the pool, the connection pool manager returns a Connection object that is a handle to that physical connection. If no PooledConnection object is available, the connection pool manager calls the ConnectionPoolDataSource method getPooledConnection to create a new physical connection. The JDBC driver implementing ConnectionPoolDataSource creates a new PooledConnection object and returns a handle to it.
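To make the logical/physical split concrete, here is a toy pool manager, a sketch only and not tied to any real driver: the "physical" connections are no-op stand-ins built with `java.lang.reflect.Proxy` (a real pool manager would obtain them from `ConnectionPoolDataSource.getPooledConnection()`), and the logical handle intercepts `close()` to recycle the physical connection rather than close it:

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

// Toy pool manager illustrating the logical/physical connection split.
class ToyPool {
    static final AtomicInteger physicalOpens = new AtomicInteger();
    private final Deque<Connection> idle = new ArrayDeque<>();

    // Hands out a logical Connection whose close() returns the physical
    // connection to the pool instead of closing it. (A real pool would
    // also guard against double-close and validate the connection.)
    Connection getConnection() {
        Connection physical = idle.poll();
        if (physical == null) {
            physical = newPhysical();
        }
        final Connection p = physical;
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                (proxy, method, args) -> {
                    if (method.getName().equals("close")) {
                        idle.push(p);   // recycle, don't close
                        return null;
                    }
                    return method.invoke(p, args); // delegate everything else
                });
    }

    // No-op stand-in for a physical connection, counting how many were opened.
    private static Connection newPhysical() {
        physicalOpens.incrementAndGet();
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                (proxy, method, args) -> null);
    }
}
```

Obtaining a connection, closing it, and obtaining another yields the same physical connection: only one "physical open" ever happens, which is exactly the transparency the spec describes.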
Unfortunately, some JDBC drivers have created data sources that provide connection pooling by implementing both DataSource and ConnectionPoolDataSource in a single class, instead of the intent of the JDBC spec of having a DataSource that uses a ConnectionPoolDataSource. This has resulted in implementations that would work if used as a normal DataSource, but would break if used as a ConnectionPoolDataSource (e.g. in the connection pool of an application server), or where the interface was misunderstood and the wrong methods were used to create connections (e.g. calling getPooledConnection().getConnection()).
I have seen implementations (including in Jaybird) where the getPooledConnection() would be used to access a connection pool internal to the implementation, or where only connections obtained from the getConnection() of the implementation would work correctly, leading to all kinds of oddities and incorrect behavior when that implementation was used to fill a connection pool in an application server using the getPooledConnection().
Maybe Sybase did something similar, and then decided that wasn't such a good idea so they changed the DataSource.getConnection() to throw an exception to make sure it wasn't used in this way, but at the same time maintaining the API compatibility by not removing the methods defined by DataSource. Or maybe they extended a normal DataSource to easily create the physical connection (instead of wrapping a normal one), but don't want users to use it as a DataSource.
