This is what I want to achieve using the MyBatis cache (or by combining it with Ehcache or another cache):
- load the entire result set of an aggregate query into the cache
- query this cached result set and apply a SQL-style filter (between start and end dates)
I searched the web but could not find an answer. Please help.
Suggestions welcome.
Ehcache has a search API: you can load entries into the cache and afterwards search them by whatever criteria you like, including dates.
Of course, this means implementing the caching mechanism yourself, perhaps by extending EhcacheCache or, since you're using Spring, by extending AbstractCacheManager or EhCacheCacheManager.
Keep performance in mind, though: a cache is meant for caching, not for querying, especially the standalone version of Ehcache.
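To make the idea concrete, here is a minimal plain-Java sketch of "load the whole result set once, then filter it by date range in memory". It is not the Ehcache search API; the `DailyTotal` row type and the `CachedAggregate` class are hypothetical names invented for illustration, and a real setup would populate the structure from the MyBatis mapper result.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical row type for one line of the aggregate result set.
final class DailyTotal {
    final LocalDate day;
    final String product;
    final long total;

    DailyTotal(LocalDate day, String product, long total) {
        this.day = day;
        this.product = product;
        this.total = total;
    }
}

// Plain-Java stand-in for a cached result set that can be filtered by date.
final class CachedAggregate {
    // Rows are keyed by date so a range lookup does not scan every entry.
    private final NavigableMap<LocalDate, List<DailyTotal>> byDay = new TreeMap<>();

    // Load the entire result set once, e.g. right after running the query.
    CachedAggregate(List<DailyTotal> rows) {
        for (DailyTotal row : rows) {
            byDay.computeIfAbsent(row.day, d -> new ArrayList<>()).add(row);
        }
    }

    // In-memory counterpart of "WHERE day BETWEEN :start AND :end" (inclusive).
    List<DailyTotal> between(LocalDate start, LocalDate end) {
        List<DailyTotal> result = new ArrayList<>();
        if (start.isAfter(end)) {
            return result; // subMap would throw for an inverted range
        }
        for (List<DailyTotal> rows : byDay.subMap(start, true, end, true).values()) {
            result.addAll(rows);
        }
        return result;
    }
}
```

An object like this could itself be stored as a single cache value, so the cache only does what it is good at (holding the data) while the filtering happens in application code.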
I have two caches that hold different types.
I would like to run a paged query across both of them, that is, pass in sort and filter values and get content from both caches.
Is there a way to do this without writing the merge and pagination by hand?
Currently I can only do something like this:
val queryFactory = Search.getQueryFactory(cache)
queryFactory.from(Class.getClass)
or
val searchManager = Search.getSearchManager(cache)
searchManager.buildQueryBuilderForClass(Class.getClass).get()
Searching across multiple caches is not supported and there are no concrete plans to support it. Neither the query DSL nor the direct Lucene API allow it. The workaround is to merge the search results yourself.
The main reason is that each cache has its own separate set of indexes. A search across caches would have to retrieve data from multiple indexes and merge it, which is not efficient in the current implementation, so it was left out for technical reasons for now.
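A minimal sketch of the manual merge, in plain Java. It assumes each cache has already been queried separately with the same filter and that both result lists have been mapped to a common type `T`; the `MergedPage` class and its `page` method are hypothetical names, not part of any Infinispan API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: merges two per-cache result lists and cuts out one page.
final class MergedPage {
    static <T> List<T> page(List<? extends T> first,
                            List<? extends T> second,
                            Comparator<? super T> order,
                            int pageIndex,
                            int pageSize) {
        // Combine both result lists into one.
        List<T> merged = new ArrayList<>(first);
        merged.addAll(second);
        // Apply the shared sort order, then skip to the requested page.
        return merged.stream()
                .sorted(order)
                .skip((long) pageIndex * pageSize)
                .limit(pageSize)
                .collect(Collectors.toList());
    }
}
```

To keep this cheap, each per-cache query only needs to return its first (pageIndex + 1) * pageSize hits in the same sort order, since no later hit from either cache can appear on the requested page.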
I'm using Spring Boot, and for easier setup (no fiddling with user rights) I decided to use the RAM directory provider instead of the filesystem (FS) one. Can anyone confirm my line of thought?
Whenever I restart, I lose the index.
Anything that goes through Hibernate is indexed automatically, because the relevant entities carry @Indexed annotations.
After a restart I need to rebuild the lost index using:
try {
    FullTextEntityManager fullTextEntityManager =
        Search.getFullTextEntityManager(entityManager);
    fullTextEntityManager.createIndexer().startAndWait();
} catch (InterruptedException e) {
    System.out.println(
        "An error occurred trying to build the search index: " +
        e.toString());
}
If I use FSDirectoryProvider instead, the index will be reloaded automatically from the filesystem and the code above is no longer necessary, unless the ORM entities change. I guess I would then need to force re-indexing manually somehow.
Is there a DBDirectory implementation that one can depend on? If so, is the index loaded into RAM, or is each index update written to the DB separately?
Nearly all of your assumptions are correct.
An in-memory index is lost in two circumstances:
- JVM shutdown
- the index gets reopened while the application is still running
You need to reindex your entities only if:
- you changed how your entities are analyzed or tokenized during indexing or searching
- you added entity properties to the index or removed them from it
- you changed relations between entities in a way that affects your index
At the time of writing there is no database-based directory. In the past I tried to adapt the Compass JdbcDirectory. Unfortunately, I never had the time to take it further than a working proof of concept.
There has been an open issue in the project tracker since 2011. It seems there won't be official support for a database-backed directory in Hibernate Search in the near future.
Keep in mind that an in-memory index is only sufficient for small data:
Warning: This class is not intended to work with huge indexes.
Everything beyond several hundred megabytes will waste resources (GC
cycles), because it uses an internal buffer size of 1024 bytes,
producing millions of byte[1024] arrays. This class is optimized for
small memory-resident indexes. It also has bad concurrency on
multithreaded environments.
You can use the Infinispan Directory as an alternative to keep stuff in memory but have a replica on durable storage.
The Infinispan project provides both:
- an Apache Lucene Directory implementation
- a Hibernate Search DirectoryProvider
A pointer to the source code
Infinispan is meant to aggressively cache data in memory, but has several options to offload such data to permanent storage by enabling a CacheStore in its configuration.
Among the many CacheStore implementations, you might be interested in:
- the FSCacheStore, which stores data on the filesystem
- the JDBC-based CacheStore, which is often a good combination with Hibernate
There are many more alternatives, such as connecting to cloud storage or popular NoSQL databases. Infinispan also supports real-time replication across nodes, so your options for index storage in Hibernate Search are pretty much limitless.
We are using Spring Data JPA for database access. Our repositories contain basic query methods. What we want to do now is use the Specification interface (Criteria API) combined with complex query methods (like findByName(Specification spec)). The problem is that the two approaches exclude each other (since there are now two WHERE clauses). Is there any way to do this, such as telling JPA to combine the two WHERE parts with AND? We want this because some parts of the WHERE clause are essential for every query and should be defined in the name of the query method. The Specification should only contain the individual criteria for individual use cases.
Or is there any other way to solve this?
Currently this is not supported. Please feel free to raise a JIRA issue if you think this would be a worthwhile enhancement.
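One workaround is to express the mandatory part as a specification too and combine the two with AND; Spring Data JPA lets specifications be composed this way (for example `Specification.where(a).and(b)`, depending on the version in use). The sketch below only illustrates the composition idea with `java.util.function.Predicate` as a stand-in for `Specification`; `Person`, `PersonFilters`, `hasName`, and `olderThan` are hypothetical names, and nothing here touches a database.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical entity used only for this illustration.
final class Person {
    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

final class PersonFilters {
    // Mandatory condition that would otherwise live in the method name (findByName).
    static Predicate<Person> hasName(String name) {
        return p -> p.name.equals(name);
    }

    // Use-case specific criterion that would normally be the Specification.
    static Predicate<Person> olderThan(int age) {
        return p -> p.age > age;
    }

    // Both conditions are joined with AND before the filter is applied.
    static List<Person> find(List<Person> people,
                             Predicate<Person> mandatory,
                             Predicate<Person> extra) {
        return people.stream()
                .filter(mandatory.and(extra))
                .collect(Collectors.toList());
    }
}
```

With real specifications the repository would extend JpaSpecificationExecutor and receive the combined specification through findAll, so the mandatory part is written once and reused by every caller.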
I would like to apply a Criteria query to an in-memory collection of entities instead of the database. Is this possible? That is, can the Criteria API work like LINQ? Alternatively, can a Criteria query be converted to a LINQ query?
Thanks!
I don't believe you can use Criteria to query an in-memory collection and, come to think of it, it doesn't seem to make much sense. If I understand correctly, you've already queried your database. I'd suggest either tuning your original query (whichever method you choose) to include all of your filters, or using LINQ (as you suggested) to refine your results.
Also, what's your reasoning for wanting to query from memory?
It sounds like you're rolling your own caching mechanism. I would highly recommend checking out NHibernate's 2nd level cache. It handles many complex scenarios gracefully such as invalidating query results on updates to the underlying tables.
http://ayende.com/Blog/archive/2009/04/24/nhibernate-2nd-level-cache.aspx
I use the JBoss EJB 3.0 implementation (JBoss 4.2.3 server).
At the beginning I created a native query every time, using a construction like
Query query = entityManager.createNativeQuery("select * from _table_");
Of course this is not that efficient; I ran some tests and found that it really takes a lot of time. Then I found a better way to deal with it: using an annotation to define native queries:
@NamedNativeQuery( name = "fetchData", query = "select * from _table_", resultClass=Entity.class )
and then just use it
Query query = entityManager.createNamedQuery("fetchData");
The performance of the line above is twice as good as where I started, but still not as good as I expected. Then I found that I can switch to the Hibernate annotation for NamedNativeQuery (JBoss's EJB implementation is based on Hibernate anyway) and add one more thing:
@NamedNativeQuery( name = "fetchData2", query = "select * from _table_", resultClass=Entity.class, readOnly=true)
readOnly marks whether the results are fetched in read-only mode. That sounds good, because at least in my case I don't need to update the data; I just want to fetch it for a report. When I started the server to measure performance, I noticed that the query without readOnly=true (the default is false) got faster with each iteration, while the other one (fetchData2) stayed stable. The time difference between them kept shrinking, and after 5 iterations both were almost equally fast.
The questions are:
1) Is there any other way to speed up query usage? It seems that named queries should be prepared once, but I can't say for sure. In fact, creating the query once and then just reusing it would be better from a performance point of view, but caching this object is problematic: after creating the query I can set parameters (when I use ":variable" in the query), and that changes the query object (doesn't it?). So, is there any way to cache them? Or is a named query the best option I have?
2) Are there any other approaches to make result retrieval faster? For instance, I don't need those entities to be attached and I won't update them; all I need is to fetch a collection of data. Maybe readOnly is the only available way and I can't speed it up, but who knows :)
P.S. I'm not asking about DB performance. All I need now is a way to avoid creating the query object every time, so that I use it efficiently, and to let EJB do less work for the same returned data.
Added 15.03.2010:
By query I mean the query object (so, how to cache this object for reuse). Caching query results is not a solution for me, because the WHERE clause can be almost unique for each query due to the floating-point parameters in it. A cache simply cannot know that "a > 50.0001" and "a > 50.00101" may give the same result, or may not.
You could use the second-level cache and the query cache to avoid hitting the database (this works especially well with read-only objects). The second-level cache is supported by Hibernate (with a third-party cache provider), but note that it is an extension to JPA 1.0.