Problems with Spring and Hibernate SessionFactory: domain object scope restricted to session

I have been using the session factory (a singleton bean injected into the DAO objects) in my Spring/Hibernate application, which uses a service-layer architecture, and I have the following issue:
Any time I get a domain object from the database, it uses a new session provided by the Hibernate session factory. When the same row is requested several times, this leads to multiple instances of the same domain object. (With a single session, the same reference would be returned each time.) Thus, any change made to one of those domain objects is not reflected in the other domain objects representing the same row.
I am developing a Swing application with multiple views, and I fetch the same DB row from different locations (and queries), so I need to obtain domain objects pointing to the same instance.
My question then is: is there a way to make this happen using the SessionFactory? If not, is it good practice to use a single session for my whole application? In that case, how and where should I declare this session? (Should it be a bean injected into the DAO objects just like the sessionFactory?)
Thank you in advance for your help

In Spring, the Hibernate session (I will call it h-session) is usually bound to the thread (see the JavaDoc for HibernateTransactionManager), so an h-session is acquired once per thread.
The first-level cache (the h-session cache, which is always turned on) is used to return the same object if you call get or load several times within one h-session. But this cache doesn't work for queries.
Also, you shouldn't forget about problems related to transaction isolation. Most applications use the "read committed" isolation level, and this isolation level is affected by a phenomenon known as "non-repeatable reads". Basically, you could receive several versions of the same row in one transaction if you query for that row several times (because the row could be updated between queries by another transaction).
So, you shouldn't query several times for the same data in one h-session/transaction.
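A minimal sketch of the difference, assuming a mapped Person entity (the entity name is illustrative, not from the question):
// within one h-session, a repeated get() by ID is answered by the first-level cache
Session session = sessionFactory.getCurrentSession();
Person p1 = (Person) session.get(Person.class, 1L);
Person p2 = (Person) session.get(Person.class, 1L);
assert p1 == p2; // same instance, only one SELECT issued

// a query, by contrast, always executes SQL against the database,
// even in the same h-session
List<?> people = session.createQuery("from Person").list();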

You're looking for the Open Session in View pattern. Essentially, you want to bind a Session to your thread on application startup and use the same Session throughout the lifetime of the application. You can do this by creating a singleton util class which keeps the session, like so (note that my example uses an EntityManager instead of a Session, but your code will be essentially the same):
// the factory is created once at startup, e.g. via Persistence.createEntityManagerFactory(...)
private static EntityManagerFactory entityManagerFactory;
private static EntityManager entityManager;

public static synchronized void setupEntityManager() {
    if (entityManager == null) {
        entityManager = entityManagerFactory.createEntityManager();
    }
    // bind the EntityManager to the current thread so Spring's JPA
    // infrastructure finds and reuses it
    if (!TransactionSynchronizationManager.hasResource(entityManagerFactory)) {
        TransactionSynchronizationManager.bindResource(entityManagerFactory,
                new EntityManagerHolder(entityManager));
    }
}

public static synchronized void tearDownEntityManager() {
    if (entityManager != null) {
        if (entityManager.isOpen()) {
            entityManager.close();
        }
        if (TransactionSynchronizationManager.hasResource(entityManagerFactory)) {
            TransactionSynchronizationManager.unbindResource(entityManagerFactory);
        }
        if (entityManagerFactory.isOpen()) {
            entityManagerFactory.close();
        }
    }
}
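A hedged usage sketch, assuming the methods above live in a util class called PersistenceUtil (the class name is hypothetical):
// at Swing application startup, before any DAO is used
PersistenceUtil.setupEntityManager();
// ... the application runs; DAOs reuse the thread-bound EntityManager ...
// at application shutdown
PersistenceUtil.tearDownEntityManager();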
Note that there are inherent risks associated with the Open Session in View pattern. For example, I noticed in the comments that you intend to use threading in your application. Sessions are not thread-safe, so you'll have to make sure you aren't accessing the database from multiple threads.*
You'll also have to be more aware of your fetching strategy for collections. With an open session and lazy loading there's always the chance that you'll put undue load on your database.
*I've used this approach in a NetBeans application before, which I know uses threading for certain tasks. We never had any problems with it, but you need to be aware of the risks, of which there are many.
Edit
Depending on your situation, it may also be possible to evict your domain objects from the Session and cache the detached objects for later use. This strategy would of course require that your domain objects not be updated very often; otherwise your application would become unnecessarily complicated.

Related

Make sure that data is loaded before application startup | Spring WebFlux

I have a Spring WebFlux application.
I am loading a list from the database into a bean. I have two ways of implementing the loading of this bean.
Approach 1: Reactive way
@Bean
public List<Item> getItemList() throws IOException {
    List<Item> itemList = new ArrayList<>();
    itemRespository.findAll().collectList().subscribe(itemList::addAll);
    return itemList;
}
Approach 2: Blocking way
@Bean
public List<Item> getItemList() throws IOException {
    List<Item> itemList = itemRespository.findAll().collectList().block();
    return itemList;
}
Now, as I want my application to be reactive, I don't want to use the blocking way.
But the endpoints which I am exposing through my controller depend on this bean's data.
@RestController
public class SomeController {

    @Autowired
    private List<Item> itemList;

    @GetMapping("/endpoint")
    public void process() {
        List<Item> list = itemList; // may not be initialized yet, as the bean is loaded reactively
        // some more code
    }
}
So with the reactive approach, somebody may call my endpoint (since the application has already started and is ready to serve requests) while, for some reason (for example, slowness of the database server), the list has not yet been retrieved from the database, producing inconsistent results for the users calling my endpoint (which in turn depends on this bean).
I am looking for a solution for this scenario.
EDIT: The more precise question is: should I load beans on which my exposed endpoints depend reactively at all?
The application architecture presented here is a typical example of a design that is inherently blocking.
If the first request made to the API needs the items to be in place, then we must make sure that they are there before we can take on requests. And the only way to ensure that is to block until the items have in fact been fetched and stored.
Since the design is inherently blocking, we need to rethink our approach.
What we want is to make the service available for requests as quickly as possible. We can solve this by using a cache that gets filled when the first request is made (see the sketch at the end of this answer).
This means the application starts up with an empty cache. The cache could, for instance, be a @Component, as Spring beans are singletons by default.
The steps would be:
service starts up, cache is empty
service receives its first request
checks if there is data in the cache
if data is stale, evict the cache
if cache is empty, fetch the data from our source
fill the cache with our fetched data
set a ttl (time to live) on the data placed in the cache
return the data to the calling client
Second request:
request comes in to the service
checks if there is data in the cache
checks if the data is stale
if not, grab the data and return it to the calling subscriber
There are several cache solutions out there. Spring has its @Cacheable annotation, which by default is a simple key-value store but can be paired with an external solution like Redis, etc.
Another option is Google Guava, whose caching module has a very good read-me on its GitHub.
This type of solution is called trading memory for CPU: we gain startup time and fast requests (CPU), but the cost is that we spend some more memory to hold the data in a cache.
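A minimal sketch of such a cache with Reactor, assuming the ItemRepository from the question is reactive (the class name ItemCache and the TTL value are illustrative):
import java.time.Duration;
import java.util.List;
import org.springframework.stereotype.Component;
import reactor.core.publisher.Mono;

@Component
public class ItemCache {

    private final Mono<List<Item>> cachedItems;

    public ItemCache(ItemRepository itemRepository) {
        // cache(ttl) replays the fetched list to every subscriber for 10 minutes;
        // after the TTL expires, the next subscriber triggers a fresh findAll(),
        // which covers the "stale -> evict -> refetch" steps above
        this.cachedItems = itemRepository.findAll()
                .collectList()
                .cache(Duration.ofMinutes(10));
    }

    public Mono<List<Item>> getItems() {
        return cachedItems;
    }
}
A controller can then depend on this cache and stay non-blocking:
@GetMapping("/endpoint")
public Mono<List<Item>> process() {
    return itemCache.getItems();
}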

Optimizing findAll in Spring Data JPA

I have a table which holds a list of lookup values (at most 50 rows).
Currently, I am querying this table every time I look up a particular value, which is not efficient.
So, I am planning to optimize this by loading all the values at once as a List from the repository using findAll:
List<CardedList> findAll();
My question here is: Class A -> Class B, where Class B holds this repository. Will findAll be queried every time Class A calls Class B?
class A {
    // for each item in the list, call Class B
    b.someMethod();
}

class B {
    @Autowired
    CardedListRepository cardRepo;

    someMethod() {
        cardRepo.findAll();
    }
}
What is the best way to achieve this?
If it is just 50 rows, you could cache them in an instance variable of a service and check like this:
class B {
    @Autowired
    CardedListRepository cardRepo;

    List<CardedList> cardedList = new ArrayList<>();

    someMethod() {
        if (cardedList.isEmpty()) {
            cardedList = cardRepo.findAll();
        }
        // do the rest of someMethod
    }
}
The proposed "solution" by @Juliyanage Silva (to "cache" the findAll query result as a simple instance variable of service B) can be very dangerous and should not be implemented before checking very carefully that it works under all circumstances.
Just imagine the same service instance being called from a subsequent transaction: you would end up with a (probably outdated) list of detached entities (e.g., leading to LazyInitializationExceptions when accessing uninitialized properties, etc.).
Hibernate already provides several caching mechanisms, e.g. the standard first-level cache, which avoids unnecessary DB round trips when looking up an already loaded entity by ID within the same transaction.
However, query results (as from findAll) are not cached by default, as explained in the documentation:
Caching of query results introduces some overhead in terms of your application's normal transactional processing. For example, if you cache results of a query against Person, Hibernate will need to keep track of when those results should be invalidated because changes have been committed against any Person entity.
That, coupled with the fact that most applications simply gain no benefit from caching query results, leads Hibernate to disable caching of query results by default.
To enable the Hibernate query cache, the second-level cache needs to be configured. To prevent ending up with stale entries when running multiple application instances, this calls for a distributed cache (like Hazelcast or EhCache).
There are also various discussions on using Spring's caching mechanisms for this purpose. However, there are various pitfalls when it comes to caching collections. And when running multiple application instances, you may need a distributed cache or another global invalidation mechanism, too:
How to add cache feature in Spring Data JPA CRUDRepository
Spring Cache with collection of items/entities
Spring Caching not working for findAll method
So depending on your use case, it may be easiest to just avoid unnecessary calls to service B by storing the result in a local variable within the calling method of service A, as sketched below.
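A minimal sketch of that last suggestion (the method names processItems and findAllCarded are hypothetical):
class A {

    @Autowired
    private B b;

    void processItems(List<Item> items) {
        // fetch the lookup values once, not once per loop iteration
        List<CardedList> cardedList = b.findAllCarded();
        for (Item item : items) {
            b.someMethod(item, cardedList);
        }
        // the list never outlives the calling method,
        // so no stale detached entities are kept around
    }
}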

Inject Session object into DAO bean instead of SessionFactory?

In our application we are using Spring and Hibernate.
In all the DAO classes we have the SessionFactory auto-wired, and each of the DAO methods calls getCurrentSession().
The question I have is: why don't we inject a Session object instead of a SessionFactory object, in prototype scope? This would save us the call to getCurrentSession.
I think the first method is correct, but I'm looking for concrete scenarios where the second method would throw errors or perform badly.
When you define a bean with prototype scope, a new instance is created for each place it needs to be injected into. So each DAO will get a different instance of Session, but all invocations of methods on a given DAO will end up using the same session. Since a session is not thread-safe, it should not be shared across multiple threads, so this will be an issue.
For most situations the session should be transaction-scoped, i.e., a new session is opened when the transaction starts and closed automatically once the transaction finishes. In a few cases it might have to be extended to request scope.
If you want to avoid calling SessionFactory.getCurrentSession, you will need to define your own scope implementation to achieve that.
This is something that is already implemented for JPA using proxies. In the case of JPA, an EntityManager is injected instead of an EntityManagerFactory; instead of @Autowired there is the @PersistenceContext annotation. A proxy is created and injected during initialization. When any method is invoked, the proxy gets hold of the actual EntityManager implementation (using something similar to SessionFactory.getCurrentSession) and delegates to it.
A similar thing can be implemented for Hibernate as well, but the additional complexity is not worth it. It is much simpler to define a getSession method in a BaseDAO which internally calls SessionFactory.getCurrentSession(). With this, the code using the session is identical to injecting a session.
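A minimal sketch of that BaseDAO approach (the class names BaseDao and PersonDao are illustrative):
public abstract class BaseDao {

    @Autowired
    private SessionFactory sessionFactory;

    // returns the session bound to the current transaction/thread,
    // so subclasses never deal with the SessionFactory directly
    protected Session getSession() {
        return sessionFactory.getCurrentSession();
    }
}

public class PersonDao extends BaseDao {

    public Person findById(long id) {
        return (Person) getSession().get(Person.class, id);
    }
}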
Injecting prototype sessions means that each one of your DAO objects will, by definition, get its own Session... On the other hand, SessionFactory gives you the power to open and share sessions at will.
In fact, getCurrentSession will not open a new Session on every call... Instead, it will reuse sessions bound to the current session context (e.g., thread, JTA transaction, or externally managed context).
So let's think about it; assume that in your business layer there is an operation that needs to read and update several database tables (which means interacting, directly or indirectly, with several DAOs)... Pretty common scenario, right? Customarily, when this kind of operation fails you will want to roll back everything that happened in the current operation, right? So, for this particular case, which strategy seems appropriate?
Spanning several sessions, each one managing its own kind of objects and bound to a different transaction.
Having a single session managing the objects related to this operation, demarcating transactions according to your business needs.
In brief, sharing sessions and demarcating transactions effectively will not only improve your application's performance; it is part of the functionality of your application.
I would strongly recommend you read Chapter 2 and Chapter 13 of the Hibernate Core Reference Manual to better understand the roles that SessionFactory, Session, and Transaction play within the framework. It will also teach you about units of work as well as popular session patterns and anti-patterns.

Spring, Hibernate - Batch processing of large amounts of data with good performance

Imagine you have a large amount of data in a database, approximately 100 MB. We need to process all of it somehow (update it or export it somewhere else). How do we implement this task with good performance? How should transaction propagation be set up?
Example 1 (with bad performance):
@Singleton
public class ServiceBean {

    public void processAllData() {
        List<Entity> entityList = dao.findAll();
        for (Entity entity : entityList) {
            process(entity);
        }
    }

    private void process(Entity entity) {
        // data processing
        // saves data back (UPDATE operation) or exports it somewhere else (just READs from DB)
    }
}
What could be improved here?
In my opinion:
I would set the Hibernate batch size (see the Hibernate documentation on batch processing).
I would separate ServiceBean into two Spring beans with different transaction settings. The method processAllData() should run outside of a transaction, because it operates on large amounts of data and a potential rollback wouldn't be quick (I guess). The method process(Entity entity) would run in a transaction; rolling back a single entity is no big deal. A sketch of that split follows.
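A minimal sketch of the proposed split (the names BatchService, EntityProcessor, and findAllIds are hypothetical; process takes an ID so each call can run in its own transaction):
@Service
public class BatchService {

    @Autowired
    private EntityDao dao; // hypothetical DAO that can list all IDs

    @Autowired
    private EntityProcessor entityProcessor;

    // intentionally NOT @Transactional: the loop itself runs outside any transaction
    public void processAllData() {
        for (Long id : dao.findAllIds()) {
            entityProcessor.process(id);
        }
    }
}

@Service
public class EntityProcessor {

    // each entity is processed in its own short transaction,
    // so a failure rolls back only that one entity
    @Transactional
    public void process(Long id) {
        // load by ID, process, save back
    }
}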
Do you agree? Any tips?
Here are 2 basic strategies:
JDBC batching: set the JDBC batch size, usually somewhere between 20 and 50 (hibernate.jdbc.batch_size). If you are mixing and matching object C/U/D operations, make sure you have Hibernate configured to order inserts and updates, otherwise it won't batch (hibernate.order_inserts and hibernate.order_updates). And when doing batching, it is imperative to clear() your Session so that you don't run into memory issues during a large transaction.
Concatenated SQL statements: implement the Hibernate Work interface and use your implementation class (or an anonymous inner class) to run native SQL against the JDBC connection. Concatenate hand-coded SQL via semicolons (this works in most DBs) and then run that SQL via doWork. This strategy allows you to use the Hibernate transaction coordinator while harnessing the full power of native SQL; see the sketch below.
You will generally find that no matter how fast you can get your OO code, using DB tricks like concatenating SQL statements will be faster.
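A minimal sketch of the Work approach (the table and column names are made up for illustration):
session.doWork(new Work() {
    @Override
    public void execute(Connection connection) throws SQLException {
        Statement stmt = connection.createStatement();
        try {
            // several hand-concatenated statements sent in one round trip
            stmt.execute("update item set price = price * 1.1 where category = 'A'; "
                    + "update item set active = 0 where stock = 0");
        } finally {
            stmt.close();
        }
    }
});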
There are a few things to keep in mind here:
Loading all entities into memory with a findAll method can lead to OOM exceptions.
You need to avoid attaching all of the entities to a session, since every time Hibernate executes a flush it will need to dirty-check every attached entity. This will quickly grind your processing to a halt.
Hibernate provides a stateless session, which you can use with a scrollable result set to scroll through entities one by one - docs here. You can then use this session to update each entity without ever attaching it to a session; see the sketch below.
The other alternative is to use a stateful session but clear the session at regular intervals, as shown here.
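A minimal sketch of the stateless-session variant (the entity name and query are illustrative):
StatelessSession session = sessionFactory.openStatelessSession();
Transaction tx = session.beginTransaction();
ScrollableResults results = session.createQuery("from Entity")
        .scroll(ScrollMode.FORWARD_ONLY);
while (results.next()) {
    Entity entity = (Entity) results.get(0);
    process(entity);
    session.update(entity); // issues an UPDATE; nothing is attached, so no dirty checking
}
results.close();
tx.commit();
session.close();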
I hope this is useful advice.

Object session in Play Framework

How can I store an instance object for each user session?
I have a class modeling a complex algorithm. The algorithm is designed to run step by step. I need to instantiate an object of this class for each user, and each user should be able to advance their own instance step by step.
You can only store such objects in the Cache. The objects must be serializable for this. In the session you store a key (which must be a String) pointing to the Cache entry. Make sure that your code still works if the object has been removed from the cache (treat it the same as a session timeout). This is explained at http://www.playframework.org/documentation/1.0.3/cache.
Hope that solves your problem.
To store values in the session:
// first get the user's session
// if your class extends play.mvc.Controller you can access the session object directly
Session session = Scope.Session.current();
// to store values into the session
session.put("name", object);
If you want to invalidate/clear the session object:
session.clear();
From the Play documentation: http://www.playframework.org/documentation/1.1.1/cache
Play has a cache library and will use Memcached when used in a distributed environment.
If you don't configure Memcached, Play will use a standalone cache that stores data in the JVM heap. Caching data in the JVM application breaks the "share nothing" assumption made by Play: you can't run your application on several servers and expect it to behave consistently. Each application instance will have a different copy of the data.
You can put any object in the cache, as in the following example (from http://www.playframework.org/documentation/1.1.1/controllers#session, which uses session.getId() to save messages for each user):
public static void index() {
    List messages = Cache.get(session.getId() + "-messages", List.class);
    if (messages == null) {
        // cache miss
        messages = Message.findByUser(session.get("user"));
        Cache.set(session.getId() + "-messages", messages, "30mn");
    }
    render(messages);
}
Because it's a cache and not a session, you have to take into account that the data might no longer be available, and have some means of retrieving it again from somewhere (the Message model, in this case).
Anyway, if you have enough memory and the interaction with the user is short, the data should be there; if it's not, you can redirect the user to the beginning of the wizard (you are talking about some kind of wizard page, right?).
Keep in mind that Play, with its stateless, share-nothing approach, really has no session at all; underneath, it just handles it through cookies, which is why the session can only accept strings of limited size.
Here's how you can save "objects" in a session: basically, you serialize/deserialize the objects to JSON and store the JSON string in the cookie.
https://stackoverflow.com/a/12032315/82976
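A minimal sketch of that JSON approach, assuming Gson is available and the per-user algorithm state is small enough to fit in a cookie (the class name AlgorithmState and the key are illustrative):
Gson gson = new Gson();
// store: serialize the user's algorithm state into the session cookie
session.put("algoState", gson.toJson(state));
// restore it on a later request
AlgorithmState restored = gson.fromJson(session.get("algoState"), AlgorithmState.class);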
