Retrieve DataStax Session from CassandraOperations - spring-boot

Spring Data Cassandra has removed the ability to retrieve a com.datastax.driver.core.Session from org.springframework.data.cassandra.core.CassandraOperations. I'm trying to rectify old code that relies on this. Is there a simple way to retrieve the Cassandra session? I'm looking for a way to create a prepared statement from an Insert with only access to an instance of CassandraOperations, e.g.
cassandraOperations.getSession().prepare(insert);

We've removed getSession() from CassandraOperations for two reasons:
The interface was split into CassandraOperations and CqlOperations. CassandraTemplate (which implements CassandraOperations) now uses CqlOperations as its lower-level API.
We introduced SessionFactory to be able to route CQL calls to various Cassandra Sessions. CQL execution obtains a session from the configured SessionFactory. A session is considered valid only for the duration of the execute call, as the next command could be executed on a different session.
You can still obtain a Session. Either call:
CqlTemplate cqlTemplate = (CqlTemplate) cassandraTemplate.getCqlOperations();
cqlTemplate.getSession();
or obtain Session through Spring's context (autowiring, lookup via context.getBean(Session.class), …).
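Putting it together for the original question, a minimal sketch, assuming Spring Data Cassandra 2.x with the DataStax 3.x driver (where Insert extends RegularStatement and can be passed to prepare directly); the helper class name is illustrative:
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.querybuilder.Insert;
import org.springframework.data.cassandra.core.CassandraOperations;
import org.springframework.data.cassandra.core.CassandraTemplate;
import org.springframework.data.cassandra.core.cql.CqlTemplate;

public class PrepareHelper {

    // Prepares an Insert with only a CassandraOperations reference in hand.
    static PreparedStatement prepare(CassandraOperations operations, Insert insert) {
        CqlTemplate cqlTemplate =
                (CqlTemplate) ((CassandraTemplate) operations).getCqlOperations();
        Session session = cqlTemplate.getSession();
        return session.prepare(insert); // Insert is a RegularStatement in driver 3.x
    }
}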

Related

How to turn off JPA for SpringBatch under SpringBoot

We have a Spring Boot application that uses Spring Integration and Spring Batch. We drop a file in the poller and it processes. The process inserts records into a database, then reads them back out, does some processing, and writes a file. Let's say there are 10 records. The first time we run a file we get 10 records read and 10 written. Without stopping the server, we delete all the records through a SQL client on the database, run the same file again, and get 10 records read but 20 written. I believe there is some JPA or other caching going on with the datasource. We've tried turning off several auto-configuration options for JPA and caching, but we haven't found the right configuration option to turn off the caching.
Adding a bit more detail to the question.
Basically we have a cron scheduler with a FileHandler. In the handleFile method we have the following:
public File handleFile(File file) throws Throwable {
    JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
    Job job = (Job) appContext.getBean("processInitialFileJob");
    JobExecution jb = jobLauncher.run(job, jobParametersBuilder.toJobParameters());
    ....
}
What can we do to the code above to ensure that it has a new JPA session or not use the JPA session at all? This job needs to read from the database each time and not a cached representation of the database.
Are you using Hibernate? Hibernate's first-level cache may be causing the problem. Hibernate maintains a first-level cache that is local to your Session, so once you create a session and run transactions through it, Hibernate keeps those entities in sync internally. But when you change the table outside Hibernate, Hibernate won't see the change until the session is flushed and closed.
To make sure this is not happening, create a new Session (or EntityManager in the case of JPA) inside your poller logic and close it after every read/process/write cycle, as sketched below.
Also make sure hibernate.current_session_context_class is not set to thread: since threads can be reused by the poller, the same Hibernate Session may be injected again.
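A minimal sketch of that suggestion, assuming an injected EntityManagerFactory (the Record entity and class names are hypothetical):
import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;

public class PollerReader {

    private final EntityManagerFactory emf;

    public PollerReader(EntityManagerFactory emf) {
        this.emf = emf;
    }

    // One fresh persistence context per cycle: nothing cached from earlier runs survives.
    public List<Record> readAll() {
        EntityManager em = emf.createEntityManager();
        try {
            return em.createQuery("select r from Record r", Record.class).getResultList();
        } finally {
            em.close(); // discards the first-level cache
        }
    }
}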
This ended up not being an issue with Hibernate or JPA, but an issue with a StringBuilder holding on to data from previous runs. I believe the bean holding it will need to be set up as @JobScope so that it is not reused across different executions of the job.
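For reference, a sketch of that fix; FileWritingProcessor is a hypothetical stand-in for whichever bean held the StringBuilder:
import org.springframework.batch.core.configuration.annotation.JobScope;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class BatchConfig {

    // A @JobScope bean is created anew for each JobExecution, so state such as
    // a StringBuilder cannot leak between runs of the job.
    @Bean
    @JobScope
    public FileWritingProcessor fileWritingProcessor() {
        return new FileWritingProcessor();
    }
}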

Found shared references to a collection: when using transactions in parallel from same user session

I am using Hibernate 3.3 with Spring 3.0.5 and the JPA transaction manager.
My scenario is a UI page invoking two get methods on the service layer to render two regions of the UI; the requests are parallel AJAX requests. The get methods in the service layer return two 'separate' lists of the same entity:
List<Car> getCarsA();
List<Car> getCarsB();
I have configured the JPA transaction manager as below:
<tx:method name="get*" read-only="true" propagation="REQUIRED"/>
Problem: when Hibernate/JPA flushes after each service method completes, there is potentially some collection (via many-to-many mappings) that is shared between the two lists returned by the methods, hence the exception. I am also using OpenEntityManagerInViewFilter.
I do not run into this error if I call the methods serially.
The Hibernate session object is not thread-safe. You must either use your own session instance per thread or synchronize access to the session instance with a Java synchronized block.
In a web environment you should use at least one Hibernate session per browser session. Concurrent access from the same browser session can either be synchronized or use more than one Hibernate session. When fetching different regions of the UI concurrently, as you do, I would serialize access with synchronized, unless one query takes so long that the other queries should not wait for it.
Attention: updates in one session are not directly visible in another session (in case the instance is already cached in the other session).
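A minimal sketch of the synchronized variant; the service and entity names mirror the question, while CarDao and the lock object are illustrative:
import java.util.List;

public class CarService {

    private final CarDao carDao;
    private final Object sessionLock = new Object();

    public CarService(CarDao carDao) {
        this.carDao = carDao;
    }

    // Serialize the two finders so the parallel AJAX requests never touch the
    // shared Hibernate session at the same time.
    public List<Car> getCarsA() {
        synchronized (sessionLock) {
            return carDao.findCarsA();
        }
    }

    public List<Car> getCarsB() {
        synchronized (sessionLock) {
            return carDao.findCarsB();
        }
    }
}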

Disabling EclipseLink cache

In my application, when a user logs in, the system reads some settings from the DB and stores them in the user's session. The system performs this with a JPA query using EclipseLink (JPA 2.0).
When I change some settings in the DB and sign in again, the query returns the previous results. It seems that EclipseLink is caching the results.
I have used this to correct the behavior, but it does not work:
query.setHint(QueryHints.CACHE_USAGE, CacheUsage.NoCache);
If you want to set query hints, the docs recommend doing:
query.setHint("javax.persistence.cache.storeMode", "REFRESH");
You can alternatively set @Cacheable(false) on the affected entity:
@Cacheable(false)
public class EntityThatMustNotBeCached {
    ...
}
If you're returning some kind of configuration entity and want to be sure the data is not stale, you can invoke em.refresh(yourEntity) after the query returns the entity. This forces the JPA provider to fetch fresh data from the database despite the cached copy.
If you want to disable the L2 cache you can use <shared-cache-mode>NONE</shared-cache-mode> within <persistence-unit> in persistence.xml, or use @Cacheable(false) directly on your configuration entity.
If you're returning plain fields instead of entities and still getting stale data, try clearing the persistence context by invoking em.clear().
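Combining these, a sketch of a read that bypasses stale cached results; the Settings entity and its query are illustrative:
import javax.persistence.EntityManager;
import javax.persistence.TypedQuery;

public class SettingsLoader {

    // Re-reads from the database rather than serving cached results.
    static Settings loadFresh(EntityManager em, long id) {
        TypedQuery<Settings> query = em.createQuery(
                "select s from Settings s where s.id = :id", Settings.class);
        query.setParameter("id", id);
        query.setHint("javax.persistence.cache.storeMode", "REFRESH"); // re-read from DB, refresh the cache
        Settings settings = query.getSingleResult();
        em.refresh(settings); // also force-reload this instance into the persistence context
        return settings;
    }
}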

Jdbc Connection Pooling - using multiple schemas known at runtime only

I am working on an engine that is doing the following:
gets data provider info from the DB (which tells me which database and schema to connect to for my data)
uses that info to connect to the database and get my data, which I later use to build some XML content.
The standard setup to handle and isolate database connection management would be to create a DataSource bean (I'm using Spring to wire my components) and inject it into my ProviderConfigDao (loads connection config) and ContentDao (loads data using the connection details loaded previously). This would nicely isolate connection handling from the actual code, so the DAO classes don't need to know how and when a connection is created/opened/closed, etc.
This setup unfortunately doesn't work, because when I create my connection I need to be able to specify the database schema. I don't know all the different schemas up front, so I can't create a set of DataSource objects to cover all of them; the DataSource objects must be created at runtime, and their creation hidden from the users.
The only solution I can think of is:
Have another class/interface (DataSourceProvider) with one method (a sketch follows the list below):
//Gets the connection URL as parameter (which includes the schema name).
DataSource getDataSource(String url);
Add a bean in the Spring config that provides a custom implementation of it, managing the creation of DataSource objects for each schema.
Inject that object into my DAO classes instead of the DataSource object.
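A sketch of that DataSourceProvider idea, using Spring's DriverManagerDataSource for brevity (it opens a new connection per call; a pooling implementation would be a better fit in production):
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.sql.DataSource;
import org.springframework.jdbc.datasource.DriverManagerDataSource;

public class CachingDataSourceProvider {

    // One lazily created DataSource per connection URL (the URL includes the schema).
    private final Map<String, DataSource> cache = new ConcurrentHashMap<>();

    public DataSource getDataSource(String url) {
        return cache.computeIfAbsent(url, DriverManagerDataSource::new);
    }
}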
It's not a bad solution, but I was wondering if there is already support for something like this in some open source package... I'd rather use something already done and tested than reinvent the wheel.
Cheers,
Stef.
There's a JDBC utility class for getting all the metadata from a database: org.springframework.jdbc.support.JdbcUtils. Its extractDatabaseMetaData method takes:
a DataSource
an implementation of org.springframework.jdbc.support.DatabaseMetaDataCallback
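For instance, a sketch that lists the schema names visible through a DataSource via that callback (in Spring 5+ DatabaseMetaDataCallback is generic, which would remove the cast):
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;
import org.springframework.jdbc.support.JdbcUtils;
import org.springframework.jdbc.support.MetaDataAccessException;

public class SchemaLister {

    @SuppressWarnings("unchecked")
    static List<String> listSchemas(DataSource dataSource) throws MetaDataAccessException {
        return (List<String>) JdbcUtils.extractDatabaseMetaData(dataSource, metaData -> {
            List<String> schemas = new ArrayList<>();
            try (ResultSet rs = metaData.getSchemas()) {
                while (rs.next()) {
                    schemas.add(rs.getString("TABLE_SCHEM")); // standard JDBC schema column
                }
            }
            return schemas;
        });
    }
}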

EntityManager Lifecycle when using Oracle's Virtual Private Database

I had a few questions, all related to the way an entity manager is created and used in an application with respect to Virtual Private Databases (VPD), a feature in Oracle DB that enables row-level security.
In a session bean we generally have the entity manager as a member, generally injected by the container. How is this entity manager managed by the container? To implement a Virtual Private Database we have to make sure that the VPD context remains valid for the entire user session, so that we do not have to set this context every time before we fire a query. (To put it more concretely: a session bean implements a couple of functions, each of which uses the same entity manager; it should not be the case that we set the VPD context in every one of these functions that does some DB manipulation.)
Further to #1: since the entity manager is cached in the session bean, do we need to explicitly close the entity manager in any scenario (like we do for JDBC connections)?
Also, I was wondering what the use case (or design criteria) is for choosing a JTA or a non-JTA datasource. Does the way we create an entity manager depend on this?
To add, w.r.t. the requirement on VPD:
It would be nice if the container-managed EM could somehow be made to enforce the VPD per user. Note that the EM is injected here, so there should be a mechanism to set the VPD on the connection (and later retrieve the same connection for 'this' user in 'this' session).
Without an injected EM, I think using a reference to the EMF and then setting the properties for the EM can be done. Something like:
((org.eclipse.persistence.internal.jpa.EntityManagerImpl)em.getDelegate()).setProperties
It would be overkill to set the VPD every time before a query is fired; rather, the connection should 'maintain' the VPD context during the user's session and later be released (after clearing the VPD) back to the pool.
In a session bean, an injected entity manager is container managed and by default transaction scoped.
This means that when you call any method on the session bean and a transaction is started, the persistence context of the entity manager starts. When the transaction is committed or rolled back, it ends. There is thus no scenario in which you have to explicitly close the entity manager.
Furthermore, when there is already a transaction in progress it is joined by default, and when there is already a persistence context attached to that transaction it is propagated instead of a new one being created.
Stateful session beans have another option: the extended persistence context. This one is coupled to the scope of the stateful bean instead of to individual transactions. You still don't have to do any closing yourself here.
Then, you can also inject an EntityManagerFactory (using @PersistenceUnit) and get an entity manager from it. In that case you'll have an application-managed entity manager, which you do have to close explicitly, as sketched below.
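A minimal sketch of that application-managed variant (the bean and method names are illustrative):
import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.PersistenceUnit;

public class ReportBean {

    @PersistenceUnit
    private EntityManagerFactory emf;

    public void doWork() {
        EntityManager em = emf.createEntityManager(); // application managed
        try {
            // ... queries and updates ...
        } finally {
            em.close(); // the container will NOT close this one for you
        }
    }
}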
JTA datasources (transactional datasources) are by default used with container-managed entity managers; the container takes care of everything here. Non-JTA datasources are for situations where you need separate connections to a DB, possibly outside any running transaction, on which you can set auto-commit mode, commit, rollback, etc. yourself.
These two datasource types are declared per persistence unit in persistence.xml. If you define a persistence unit with a non-JTA datasource, you typically create an entity manager for it using a factory and then manage everything yourself.
Update:
Regarding the Virtual Private Database: what you seem to need here is a user-specific connection per entity manager, but the normal way of doing things couples a persistence unit to a general data source. I guess what's needed here is a datasource that is aware of the user's context when a connection is requested.
If you completely bypass the container and largely bypass the JPA abstraction, you can go directly to Hibernate. It has connection providers that you can register globally, like DriverManagerConnectionProvider and DatasourceConnectionProvider. If you provide your own implementations of these with a setter for the actual connection, you can ask them back from a specific entity manager instance just prior to using it and set your own connection on it.
It's doable, but needless to say a bit hacky. Hopefully someone else can give a more 'official' answer. Best would of course be if Oracle provided an official plug-in for e.g. EclipseLink to support this. This document hints that it does:
TopLink / EclipseLink: Support filtering data through their @AdditionalCriteria annotation and XML. This allows an arbitrary JPQL fragment to be appended to all queries for the entity. The fragment can contain parameters that can be set through persistence unit or context properties at runtime. Oracle VPD is also supported, including Oracle proxy authentication and isolated data.
See also How to use EclipseLink JPA with Oracle Proxy Authentication.
