I was testing the cache query feature of hibernate orm extension but I'm getting strange results looking at statistics:
Configuring a query with hint org.hibernate.cacheable = true does nothing (The query is always executed
For it to work, I have to define my entity as #Cacheable. Like this, the second time I call the query, I don't see it executed and only get the results
My question is:
Documentation seems to state that query cache is independent from entity cache. My point is, what are the implications of defining the entity Cacheable?
Will it affect only query by ID?
Can this be a bug?
It's not a bug, this is the regular behaviour of Hibernate ORM:
When running an entity query there's two aspects at play:
Caching the query -> which IDs of object to return
Caching each object - by ID
You can opt to cache 1 or 2, or both - depending on what you need.
Documented here; see in particular the info note after the examples.
Related
I am working on a Java 8 / Spring Boot 2 application and I have noticed that the security module of my app internally uses the findByEmail method of my UserRepostiory (which is a standard Spring Data JPA Repository). When I enabled Hibernate SQL logging, I discovered that these queries are performed multiple times within the same session (security uses it 3-4 times and then my business code uses it some more times). Each time the query hits the database.
This surprised me, as I expected it to be cached in the Hibernate's first level cache. After reading up about it a little bit more, I found out that the first level cache only caches the result of the findById query, not others.
Is there anyway that I can cache the result of the findByEmail query in the first level cache? (I don't want the cache to be shared between sessions, I don't want to use the 2nd level cache, as I think it should be invalidated right after the current session ends).
Yes, you can cache the results of a query on a unique property if you annotate the property with the #NaturalId annotation. If you then use the dedicated API to execute the query, the results will be stored in the 1st level cache. An example:
User user = entityManager
.unwrap(Session.class)
.bySimpleNaturalId(User.class)
.load("john#example.com");
I am looking to retrieve a large dataset with a JpaRepository, backed by Oracle
table. The choices are to return a collection (List) or a Page of the entity and then step through the results. Please note - I have to consume every record in this set, exactly once. This is not a "look-for-the-first-one-from-a-large-dataset-and-return" operation.
While the paging idea is appealing, the performance will be horrible (n^2) because for each page queried, oracle will have to pull up previous n-1 pages, making the performance progressively worse as I get deeper in the result set.
My understanding of the List alternative is that the entire result-set will be loaded in memory. For oracle JPA spring does not have a backing result-set.
So here are my questions
Is my understanding of the way List works with Spring Data correct? If it's not then I will just use List.
If I am correct, is there an alternative that streams Oracle/JPA result-sets?
Is there a third way that I am not aware of.
Pageable methods in SDJ call additional select count(*) from ... every request. I think this is reason of the problem.
To avoid it you can use Slice instead of Page as return parameter, for example:
Slice<User> getAllBy(Pageable pageable);
Or you can use even List of entities with pagination:
List<User> getAllBy(Pageable pageable);
Additional info
I migrate from OpenJPA 1.2 to Hiberante 4.0
I'm using TimesTen DB
I'm doing a native query to get id's of object's that I need , and then perform find on each on of them.
In OpenJPA instead of find I used findCache() method and if it return null I use the find() method , In hibernate I used only the find() method.
I performed this operation on the same DB.
after running couple of test I saw that the performance of OpenJPA is far better.
I printed the statistics of hibernate session ( after querying and finding the same object's) and saw that the hit\miss count to the first level cache is always 0.
while the OpenJPA is clearly reaching it's cache by fetching object's with the findCache method.
How can I improve the performance of find in Hibernate ?
I suspect it referred to the difference in the first level cache implementation of this tools.
another fact: I use the same EntityManager for the application run time ( I need to minimize the cost of creating of an EntityManager - my app is soft real time )
thanks.
Firstly, why don't you just retrieve the full objects instead of the id. One select statement to retrieve a number of objects is many magnitude times faster than retrieving each item individually.
Secondly, you likely need a second level cache for hibernate. The first level cache is mostly applicable within each session.
The first level cache in Hibernate corresponds to the session. So if the session has not yet loaded a given object, it will be a miss.
You need to enable second level cache to be able to cache an object by id across sessions.
Check out the reference documentation for more info http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html_single/#performance-cache
I would like to apply a Criteria query to an in-memory collection
of entities, instead of on the database. Is this possible?
To have Criteria API work like LINQ? Or alternatively, convert
Criteria query to LINQ query.
Thanks!
I don't believe you can use Criteria to query against an in-memory collection and come to think about it it doesn't seem to make much sense. If I'm understanding everything correctly you've already queried against your database. I'd suggest to either tune your original query (whichever method you choose) to include all of your filters. Or you could use LINQ (as you suggested) to refine your results.
Also, what's your reasoning for wanting to query from memory?
It sounds like you're rolling your own caching mechanism. I would highly recommend checking out NHibernate's 2nd level cache. It handles many complex scenarios gracefully such as invalidating query results on updates to the underlying tables.
http://ayende.com/Blog/archive/2009/04/24/nhibernate-2nd-level-cache.aspx
I use JBoss EJB 3.0 implementation (JBoss 4.2.3 server)
At the beginning I created native query all the time using construction like
Query query = entityManager.createNativeQuery("select * from _table_");
Of couse it is not that efficient, I performed some tests and found out that it really takes a lot of time... Then I found a better way to deal with it, to use annotation to define native queries:
#NamedNativeQuery( name = "fetchData", value = "select * from _table_", resultClass=Entity.class )
and then just use it
Query query = entityManager.createNamedQuery("fetchData");
the performance of code line above is two times better than where I started from, but still not that good as I expected... then I found that I can switch to Hibernate annotation for NamedNativeQuery (anyway, JBoss's implementation of EJB is based on Hibernate), and add one more thing:
#NamedNativeQuery( name = "fetchData2", value = "select * from _table_", resultClass=Entity.class, readOnly=true)
readOnly - marks whether the results are fetched in read-only mode or not. It sounds good, because at least in this case of mine I don't need to update data, I wanna just fetch it for report. When I started server to measure performance I noticed that query without readOnly=true (by default it is false) returns result with each iteration better and better, and at the same time another one (fetchData2) works like "stable" and with time difference between them is shorter and shorter, and after 5 iterations speed of both was almost the same...
The questions are:
1) is there any other way to speed query using up? Seems that named queries should be prepared once, but I can't say it... In fact if to create query once and then just use it it would be better from performance point of view, but it is problematic to cache this object, because after creating query I can set parameters (when I use ":variable" in query), and it changes query object (isn't it?). well, is here any way to cache them? Or named query is the best option I can use?
2) any other approaches how to make results retrieveng faster. I mean, for instance I don't need those Entities to be attached, I won't update them, all I need is just fetch collection of data. Maybe readOnly is the only available way, so I can't speed it up, but who knows :)
P.S. I don't ask about DB performance, all I need now is how not to create query object all the time, so use it efficient, and to "allow" EJB to do less job with the same result concerning data returning.
Added 15.03.2010:
By query I mean query object (so how to cache this object to reuse); and to cache query results is not a solution for me because of where cause in query can be almost unique for each querying because of float-pointing parameters there. Cache just will not understand that "a > 50.0001" and "a > 50.00101" can give the same result, but also can not.
You could use second level cache and query cache to avoid hitting the database (works especially well with read-only objects). Second level cache is supported by Hibernate (with a third party cache provider) but is an extension to JPA 1.0 though.