Performance issue in find() method after migration from OpenJPA 1.2 to Hibernate 4.0

I migrated from OpenJPA 1.2 to Hibernate 4.0.
I'm using TimesTen DB.
I run a native query to get the ids of the objects I need, and then perform find() on each one of them.
In OpenJPA, instead of find() I used the findCache() method, and if it returned null I fell back to find(). In Hibernate I used only the find() method.
I performed this operation on the same DB.
After running a couple of tests I saw that the performance of OpenJPA is far better.
I printed the statistics of the Hibernate session (after querying and finding the same objects) and saw that the hit/miss count for the first-level cache is always 0,
while OpenJPA is clearly reaching its cache by fetching objects with the findCache() method.
How can I improve the performance of find() in Hibernate?
I suspect it is related to the difference in the first-level cache implementations of these tools.
Another fact: I use the same EntityManager for the whole application run time (I need to minimize the cost of creating an EntityManager; my app is soft real time).
Thanks.

Firstly, why don't you just retrieve the full objects instead of the ids? A single select statement that retrieves many objects is usually far faster than retrieving each item individually.
Secondly, you likely need a second-level cache for Hibernate. The first-level cache is mostly applicable within each session.
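If the id-only native query has to stay, one way to act on the first suggestion is to load the entities with a few batched IN queries instead of one find() per id. The sketch below shows only the batching step. The IdBatcher class, the batch size and the JPQL in the comment are illustrative assumptions, not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits a list of ids into fixed-size batches so that each
// batch can be bound to a single "... where e.id in :ids" query.
final class IdBatcher {

    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            // Copy the sub-list so each batch is independent of the source list.
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Long> ids = Arrays.asList(1L, 2L, 3L, 4L, 5L);
        for (List<Long> batch : partition(ids, 2)) {
            // With JPA, each batch could be passed to something like
            // (MyEntity is an assumed entity name):
            //   em.createQuery("select e from MyEntity e where e.id in :ids", MyEntity.class)
            //     .setParameter("ids", batch)
            //     .getResultList();
            System.out.println(batch);
        }
    }
}
```

A suitable batch size depends on the database and driver, so treat the value 2 above purely as a demonstration.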

The first level cache in Hibernate corresponds to the session. So if the session has not yet loaded a given object, it will be a miss.
You need to enable the second-level cache to be able to cache an object by id across sessions.
Check out the reference documentation for more info: http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html_single/#performance-cache
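The session scoping described here can be pictured with a toy model: a first-level cache behaves roughly like a map owned by one session, so a repeated lookup in the same session is served from memory, while a fresh session starts empty. ToySession is a hypothetical stand-in written for illustration, not Hibernate's implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy stand-in for a persistence session with a session-scoped (first-level) cache.
final class ToySession {
    private final Map<Long, String> firstLevelCache = new HashMap<>();
    private final Function<Long, String> database; // simulates a SELECT by id
    int databaseHits = 0;

    ToySession(Function<Long, String> database) {
        this.database = database;
    }

    String find(Long id) {
        String cached = firstLevelCache.get(id);
        if (cached != null) {
            return cached; // served from this session's own cache
        }
        databaseHits++; // cache miss: go to the "database"
        String loaded = database.apply(id);
        if (loaded != null) {
            firstLevelCache.put(id, loaded);
        }
        return loaded;
    }

    public static void main(String[] args) {
        ToySession session = new ToySession(id -> "row-" + id);
        session.find(1L);
        session.find(1L);
        System.out.println("database hits in one session: " + session.databaseHits);
    }
}
```

A cache shared across sessions would have to live outside ToySession, which is the role the second-level cache plays.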

Related

Hibernate first level cache to hold entities found by a property that is not the ID

I am working on a Java 8 / Spring Boot 2 application and I have noticed that the security module of my app internally uses the findByEmail method of my UserRepository (which is a standard Spring Data JPA repository). When I enabled Hibernate SQL logging, I discovered that these queries are performed multiple times within the same session (security uses it 3-4 times and then my business code uses it a few more times). Each time, the query hits the database.
This surprised me, as I expected the result to be cached in Hibernate's first-level cache. After reading up on it a little more, I found out that the first-level cache only caches the result of the findById query, not others.
Is there any way to cache the result of the findByEmail query in the first-level cache? (I don't want the cache to be shared between sessions, and I don't want to use the second-level cache, as I think the result should be invalidated right after the current session ends.)
Yes, you can cache the results of a query on a unique property if you annotate the property with the @NaturalId annotation. If you then use the dedicated API to execute the query, the results will be stored in the first-level cache. An example:
User user = entityManager
    .unwrap(Session.class)
    .bySimpleNaturalId(User.class)
    .load("john@example.com");

spring jpa, findAll(Iterable<Integer> userIds)

When findAll(List<UUID> ..) in Spring Data JPA is called with a list of n UUIDs and SQL logging is enabled, I see n SQL statements logged, so I assume the DB is being called n times (once per list element) to fetch the records.
Is it possible to call the DB only once to fetch all the records, so that performance improves?
Spring Data uses an IN clause for this method, except when your entity has a composite key. That is already just one query.
So the multiple queries you see are most likely your JPA implementation deciding to return a proxy with just the id and then lazily loading the attributes on demand. See the documentation of the implementation you are using for how to prevent or control that.

Spring data Oracle JPA performance with pagination

I am looking to retrieve a large dataset with a JpaRepository, backed by an Oracle table. The choices are to return a collection (List) or a Page of the entity and then step through the results. Please note: I have to consume every record in this set, exactly once. This is not a "look-for-the-first-one-from-a-large-dataset-and-return" operation.
While the paging idea is appealing, the performance will be horrible (n^2), because for each page queried Oracle will have to pull up the previous n-1 pages, making the performance progressively worse as I get deeper into the result set.
My understanding of the List alternative is that the entire result set will be loaded into memory. With Oracle, Spring Data JPA does not keep a backing result set.
So here are my questions
Is my understanding of the way List works with Spring Data correct? If it's not then I will just use List.
If I am correct, is there an alternative that streams Oracle/JPA result-sets?
Is there a third way that I am not aware of.
Pageable methods in Spring Data JPA (SDJ) that return a Page issue an additional select count(*) from ... query on every request. I think this is the reason for the problem.
To avoid it, you can use Slice instead of Page as the return type, for example:
Slice<User> getAllBy(Pageable pageable);
Or you can even use a List of entities with pagination:
List<User> getAllBy(Pageable pageable);
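Slice-style paging can do without the count query by reading one row more than the page size and using that extra row only to decide whether another page exists. Below is a minimal sketch of that idea over an in-memory list. ToySlice is a hypothetical class written for illustration and is not Spring Data's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory model of slice-style paging: read one row more than the
// page size to learn whether a next page exists, without a separate count query.
final class ToySlice<T> {
    final List<T> content;
    final boolean hasNext;

    private ToySlice(List<T> content, boolean hasNext) {
        this.content = content;
        this.hasNext = hasNext;
    }

    static <T> ToySlice<T> fetch(List<T> table, int page, int size) {
        if (page < 0 || size <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and size must be > 0");
        }
        long offset = (long) page * size;
        int from = (int) Math.min(offset, table.size());
        // Equivalent of "OFFSET offset LIMIT size + 1".
        int to = (int) Math.min(offset + size + 1, table.size());
        List<T> rows = new ArrayList<>(table.subList(from, to));
        boolean hasNext = rows.size() > size;
        if (hasNext) {
            rows.remove(rows.size() - 1); // drop the look-ahead row
        }
        return new ToySlice<>(rows, hasNext);
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("a", "b", "c", "d", "e");
        ToySlice<String> slice = fetch(table, 0, 2);
        System.out.println(slice.content + " hasNext=" + slice.hasNext);
    }
}
```

This only removes the count query. Each page is still located by offset, so the concern raised in the question about deep pages is not addressed by it.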

Apply Laravel 4 query cache to all database reads

Laravel 4 has a query cache built into its query builder: just add ->remember(), according to the docs.
Can anybody tell me how I can apply this method to all queries in my application, without appending ->remember() to each and every database call in it? Some kind of after filter, I suppose.
You might be able to extend the query builder and simply overload the get() method to first call remember(), and then do the get() statement.
Practically, though, if you want to cache every single query, you might as well just do this at the database level. MySQL, for example, has a configuration option to automatically cache all queries for a certain amount of time. However, in an application that does a lot of inserts/updates/deletes, this will have poor performance since the cache is cleared for that table on every such call.
Using Laravel's cache for every query would also mean getting outdated data if you do inserts/updates/deletes in the meantime, so you'd have to clear the cache every time you update.
Best practice would be to decide deliberately, query by query, whether it should be cached or not.

How to cache queries in EJB and return results efficiently (performance POV)

I use the JBoss EJB 3.0 implementation (JBoss 4.2.3 server).
At the beginning I created a native query every time, using a construction like
Query query = entityManager.createNativeQuery("select * from _table_");
Of course, this is not that efficient. I ran some tests and found that it really takes a lot of time... Then I found a better way to deal with it: using an annotation to define native queries:
@NamedNativeQuery( name = "fetchData", query = "select * from _table_", resultClass=Entity.class )
and then just use it
Query query = entityManager.createNamedQuery("fetchData");
The performance of the line above is two times better than where I started, but still not as good as I expected... Then I found that I can switch to the Hibernate annotation for NamedNativeQuery (JBoss's EJB implementation is based on Hibernate anyway) and add one more thing:
@NamedNativeQuery( name = "fetchData2", query = "select * from _table_", resultClass=Entity.class, readOnly=true)
readOnly marks whether the results are fetched in read-only mode or not. That sounds good, because at least in my case I don't need to update data; I just want to fetch it for a report. When I started the server to measure performance, I noticed that the query without readOnly=true (the default is false) returns results faster and faster with each iteration, while the other one (fetchData2) stays "stable". The time difference between them gets shorter and shorter, and after 5 iterations the speed of both was almost the same...
The questions are:
1) Is there any other way to speed up query usage? It seems that named queries should be prepared once, but I can't say for sure... In fact, creating the query once and then just reusing it would be better from a performance point of view, but caching this object is problematic, because after creating the query I can set parameters (when I use ":variable" in the query), and that changes the query object (doesn't it?). So, is there any way to cache them? Or is a named query the best option I can use?
2) Are there any other approaches to make result retrieval faster? For instance, I don't need those entities to be attached and I won't update them; all I need is to fetch a collection of data. Maybe readOnly is the only available way and I can't speed it up further, but who knows :)
P.S. I'm not asking about DB performance. All I need now is a way to avoid creating the query object every time, so that I use it efficiently, and to let EJB do less work for the same returned data.
Added 15.03.2010:
By query I mean the query object (so, how to cache this object for reuse). Caching query results is not a solution for me, because the WHERE clause can be almost unique for each query due to the floating-point parameters in it. The cache will not understand that "a > 50.0001" and "a > 50.00101" can give the same result, but also might not.
You could use the second-level cache and the query cache to avoid hitting the database (this works especially well with read-only objects). The second-level cache is supported by Hibernate (with a third-party cache provider), but it is an extension to JPA 1.0.
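The concern raised in the question's addendum can be seen in a small model of a result cache keyed by the query text plus its bound parameter: two nearly identical floating-point parameters produce two different keys, so each one is loaded separately. ToyQueryCache is a hypothetical class written for illustration and does not reproduce how Hibernate's query cache builds its keys.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical result cache keyed by (query text, bound parameter).
final class ToyQueryCache {
    private final Map<List<Object>, List<String>> results = new HashMap<>();
    int misses = 0;

    List<String> get(String query, Object parameter, Supplier<List<String>> loader) {
        List<Object> key = Arrays.asList(query, parameter);
        List<String> cached = results.get(key);
        if (cached == null) {
            misses++; // no entry for this exact query/parameter pair
            cached = loader.get();
            results.put(key, cached);
        }
        return cached;
    }

    public static void main(String[] args) {
        ToyQueryCache cache = new ToyQueryCache();
        String query = "select * from _table_ where a > :min";
        cache.get(query, 50.0001, () -> Arrays.asList("row"));
        cache.get(query, 50.00101, () -> Arrays.asList("row"));
        // Both calls miss: the parameters differ even if the rows are the same.
        System.out.println("misses: " + cache.misses);
    }
}
```

In this model a cache of results only pays off when the same parameter values recur, which matches the asker's reservation.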

Resources