CacheLoader is not getting called while trying to find an entity using GemfireRepository - spring-boot

CacheLoader is not getting called while trying to find an entity using GemfireRepository.
As a workaround, I am using Region<K,V> for the lookup, which does call the CacheLoader. So I wanted to know whether there is any restriction in the Spring Data Repository abstraction that prevents the CacheLoader from being called when the entry is not present in the cache.
And is there any other alternative? I have one more scenario where my cache key is a combination of id1 & id2, and I want to get all entries based on id1. If there is no entry present in the cache, it should call the CacheLoader to load all entries from the Cassandra store.

There are no limitations nor restrictions in SDG when using the SD Repository abstraction (and SDG's Repository extension) that would prevent a CacheLoader from being invoked so long as the CacheLoader was properly registered on the target Region. Once control is handed over to GemFire/Geode to complete the data access operation (CRUD), it is out of SDG's hands.
However, you should know that GemFire/Geode only invokes CacheLoaders on gets (i.e. Region.get(key) operations), never on (OQL) queries. OQL queries are what derived query methods, and custom, user-defined @Query-annotated methods declared in the application Repository interface, execute.
NOTE: See Apache Geode CacheLoader Javadoc and User Guide for more details.
For a simple CrudRepository.findById(key) call, the call stack follows from...
SimpleGemfireRepository.findById(key)
GemfireTemplate.get(key)
And then, Region.get(key).
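To make the distinction concrete, here is a minimal, hypothetical loader (Customer and CassandraCustomerDao are illustrative names, not SDG or Geode types); once it is registered on the target Region, it fires on Region.get(key) misses, i.e. for findById(key), but never for OQL-based query methods:

import org.apache.geode.cache.CacheLoader;
import org.apache.geode.cache.CacheLoaderException;
import org.apache.geode.cache.LoaderHelper;

public class CustomerCacheLoader implements CacheLoader<Long, Customer> {

    private final CassandraCustomerDao dao; // illustrative DAO, not an SDG type

    public CustomerCacheLoader(CassandraCustomerDao dao) {
        this.dao = dao;
    }

    @Override
    public Customer load(LoaderHelper<Long, Customer> helper) throws CacheLoaderException {
        // Invoked only on Region.get(key) misses (e.g. CrudRepository.findById(key)),
        // never for derived or @Query-based OQL query methods.
        return dao.findById(helper.getKey());
    }
}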
By way of example, and to illustrate this behavior, I added the o.s.d.g.repository.sample.RepositoryDataAccessOnRegionUsingCacheLoaderIntegrationTests to the SDG test suite as part of DATAGEODE-308. You can provide additional feedback in this JIRA ticket, if necessary.
Cheers!

Related

Context in transactions with Mongoid

I need to use mongoid's transactions to execute some operations while keeping consistency in case of failure.
Following the official documentation, I understand that I have to initiate a session on a model and execute the operations between start_transaction and commit_transaction.
The thing I do not understand is the fact that I have to instantiate a session on a specific model or instance of a model.
I do not get whether it is because the model possesses a helper to execute that operation (due to being a Mongoid::Document) or because the operations I execute must be related to the model/instance used.
I mean, should I be able to execute this (I understand it is more or less wrong, because these models might be totally unrelated)?
ModelA.with_session do |s|
  s.start_transaction
  TotallyUnrelatedModelA.create!
  TotallyUnrelatedModelB.create!
  TotallyUnrelatedModelC.create!
  s.commit_transaction
end
Anyone know the reason?
Mongoid doesn't implement (or have) transactions at this time. That is a driver feature.
You shouldn't be calling commit_transaction as that is the first iteration of the transaction API exposed by the driver and doesn't support automatic retries. Mongoid documentation unfortunately hasn't yet been updated to show the correct API to use - it is the with_transaction method as described here.
To use a transaction on the driver level, the session that the transaction is started on must be passed to every operation manually, as shown in the same doc.
Mongoid doesn't have that requirement, via what it calls a persistence context. This feature is somewhat described here; the gist of it is that you can override, at runtime, where a model is read from or written to, e.g. to write to another collection.
Sessions are implemented via this same runtime override. Review this page. The with_session method retrieves the client from the active persistence context, then ensures that 1) there is a session active on that client and 2) the active persistence context is associated with that session, so that 3) each persistence operation (read & write) would specify that session to the driver.
Now, to answer your question:
The thing I do not understand is the fact that I have to instantiate a session on a specific model or instance of a model.
Mongoid needs to know what client to start the session on. It can get that client from any persistence context. It doesn't matter if you use a model class or a model instance. Because only one session can be active at a time within Mongoid (the session is stored in thread-local storage for the current thread), you must use only models that are associated with the same client that you used for starting the session, via the with_session method, regardless of how that client is arrived at by Mongoid (be that via a model class or a model instance).

Batching stores transparently

We are using the following frameworks and versions:
jOOQ 3.11.1
Spring Boot 2.3.1.RELEASE
Spring 5.2.7.RELEASE
I have an issue where some of our business logic is divided into logical units that look as follows:
Request containing a user transaction is received
This request contains various information, such as the type of transaction, which products are part of this transaction, what kind of payments were done, etc.
These attributes are then stored individually in the database.
In code, this looks approximately as follows:
TransactionRecord transaction = transactionRepository.create();
transaction.create(creationCommand);
In Transaction#create (which runs transactionally), something like the following occurs:
storeTransaction();
storePayments();
storeProducts();
// ... other relevant information
A given transaction can have many different types of products and attributes, all of which are stored. Many of these attributes result in UPDATE statements, while some may result in INSERT statements - it is difficult to fully know in advance.
For example, the storeProducts method looks approximately as follows:
products.forEach(product -> {
    ProductRecord record = productRepository.findProductByX(...);

    if (record == null) {
        record = productRepository.create();
        record.setX(...);
        record.store();
    } else {
        // do something else
    }
});
If the products are new, they are INSERTed. Otherwise, other calculations may take place. Depending on the size of the transaction, this single user transaction could obviously result in up to O(n) database calls/roundtrips, and even more depending on what other attributes are present. In transactions where a large number of attributes are present, this may result in upwards of hundreds of database calls for a single request (!). I would like to bring this down as close as possible to O(1) so as to have more predictable load on our database.
Naturally, batch and bulk inserts/updates come to mind here. What I would like to do is to batch all of these statements into a single batch using jOOQ, and execute after successful method invocation prior to commit. I have found several (SO Post, jOOQ API, jOOQ GitHub Feature Request) posts where this topic is implicitly mentioned, and one user groups post that seemed explicitly related to my issue.
Since I am using Spring together with jOOQ, I believe my ideal solution (preferably declarative) would look something like the following:
@Batched(100) // batch size as parameter, potentially
@Transactional
public void createTransaction(CreationCommand creationCommand) {
    // all inserts/updates above are added to a batch and executed on successful invocation
}
For this to work, I imagine I'd need to manage a scoped (ThreadLocal/Transactional/Session scope) resource which can keep track of the current batch such that:
1. Prior to entering the method, an empty batch is created if the method is @Batched,
2. A custom DSLContext (perhaps extending DefaultDSLContext) that is made available via DI has a ThreadLocal flag which keeps track of whether any current statements should be batched or not, and if so,
3. Intercept the calls and add them to the current batch instead of executing them immediately.
However, step 3 would necessitate having to rewrite a large portion of our code from the (IMO) relatively readable:
records.forEach(record -> {
    record.setX(...);
    // ...
    record.store();
});
to:
userObjects.forEach(userObject -> {
    dslContext.insertInto(...).values(userObject.getX(), ...).execute();
});
which would defeat the purpose of having this abstraction in the first place, since the second form can also be rewritten using DSLContext#batchStore or DSLContext#batchInsert. IMO however, batching and bulk insertion should not be up to the individual developer and should be able to be handled transparently at a higher level (e.g. by the framework).
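For reference, the explicit (non-transparent) variant alluded to above could look roughly like this; ProductRecord, PRODUCT, ProductDto and the setters are hypothetical generated/application artifacts used only for illustration:

// Collect the changed records, then store them in a single JDBC batch
// instead of issuing one store() round-trip per record.
List<ProductRecord> changed = new ArrayList<>();
for (ProductDto dto : productDtos) {
    ProductRecord record = dslContext.newRecord(PRODUCT);
    record.setName(dto.getName());
    record.setPrice(dto.getPrice());
    changed.add(record);
}
dslContext.batchStore(changed).execute();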
I find the readability of the jOOQ API to be an amazing benefit of using it; however, it seems that it does not lend itself (as far as I can tell) to interception/extension very well for cases such as these. Is it possible, with the jOOQ 3.11.1 (or even current) API, to get behaviour similar to the former with transparent batch/bulk handling? What would this entail?
EDIT:
One possible but extremely hacky solution that comes to mind for enabling transparent batching of stores would be something like the following:
Create a RecordListener and add it as a default to the Configuration whenever batching is enabled.
In RecordListener#storeStart, add the query to the current Transaction's batch (e.g. in a ThreadLocal<List>)
The AbstractRecord has a changed flag which is checked (org.jooq.impl.UpdatableRecordImpl#store0, org.jooq.impl.TableRecordImpl#addChangedValues) prior to storing. Resetting this (and saving it for later use) makes the store operation a no-op.
Lastly, upon successful method invocation but prior to commit:
Reset the changed flags of the respective records to the correct values
Invoke org.jooq.UpdatableRecord#store, this time without the RecordListener or while skipping the storeStart method (perhaps using another ThreadLocal flag to check whether batching has already been performed).
As far as I can tell, this approach should work, in theory. Obviously, it's extremely hacky and prone to breaking, as the library internals may change at any time, especially if the code depends on reflection to work.
Does anyone know of a better way, using only the public jOOQ API?
jOOQ 3.14 solution
You've already discovered the relevant feature request #3419, which will solve this on the JDBC level starting from jOOQ 3.14. You can either use the BatchedConnection directly, wrapping your own connection to implement the behaviour described below, or use this API:
ctx.batched(c -> {
    // Make sure all records are attached to c, not ctx, e.g. by fetching from c.dsl()
    records.forEach(record -> {
        record.setX(...);
        // ...
        record.store();
    });
});
jOOQ 3.13 and before solution
For the time being, until #3419 is implemented (it will be, in jOOQ 3.14), you can implement this yourself as a workaround. You'd have to proxy a JDBC Connection and PreparedStatement and ...
... intercept all:
Calls to Connection.prepareStatement(String), returning a cached proxy statement if the SQL string is the same as for the last prepared statement, or batch execute the last prepared statement and create a new one.
Calls to PreparedStatement.executeUpdate() and execute(), and replace those by calls to PreparedStatement.addBatch()
... delegate all:
Calls to other API, such as e.g. Connection.createStatement(), which should flush the above buffered batches, and then call the delegate API instead.
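For illustration, a rough sketch of such a proxy using plain JDK dynamic proxies follows. It is a simplification under stated assumptions (it glosses over generated keys, Statement.execute(String), result sets and exception unwrapping) and is not how jOOQ 3.14's BatchedConnection is actually implemented:

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public final class BatchingConnections {

    // Wrap a JDBC Connection so that consecutive, identical prepared statements
    // are buffered into a single JDBC batch.
    public static Connection wrap(Connection delegate) {
        return (Connection) Proxy.newProxyInstance(
            BatchingConnections.class.getClassLoader(),
            new Class<?>[] { Connection.class },
            new Handler(delegate));
    }

    private static final class Handler implements InvocationHandler {
        private final Connection delegate;
        private PreparedStatement lastStatement;
        private String lastSql;

        Handler(Connection delegate) {
            this.delegate = delegate;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            if ("prepareStatement".equals(method.getName()) && args != null && args[0] instanceof String) {
                String sql = (String) args[0];
                if (!sql.equals(lastSql)) {
                    flush(); // different SQL: execute any pending batch first
                    lastSql = sql;
                    lastStatement = (PreparedStatement) method.invoke(delegate, args);
                }
                return wrap(lastStatement);
            }
            flush(); // any other call (createStatement(), commit(), close(), ...) flushes the buffer
            return method.invoke(delegate, args);
        }

        // Wrap the PreparedStatement so executeUpdate()/execute() become addBatch().
        private PreparedStatement wrap(PreparedStatement statement) {
            return (PreparedStatement) Proxy.newProxyInstance(
                BatchingConnections.class.getClassLoader(),
                new Class<?>[] { PreparedStatement.class },
                (p, m, a) -> {
                    if ("executeUpdate".equals(m.getName())) {
                        statement.addBatch();
                        return 0; // real update counts are only known once the batch is flushed
                    }
                    if ("execute".equals(m.getName()) && (a == null || a.length == 0)) {
                        statement.addBatch();
                        return false;
                    }
                    if ("close".equals(m.getName())) {
                        return null; // keep the statement open until its batch is flushed
                    }
                    return m.invoke(statement, a);
                });
        }

        private void flush() throws SQLException {
            if (lastStatement != null) {
                lastStatement.executeBatch();
                lastStatement.close();
                lastStatement = null;
                lastSql = null;
            }
        }
    }
}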
I wouldn't recommend hacking your way around jOOQ's RecordListener and other SPIs, I think that's the wrong abstraction level to buffer database interactions. Also, you will want to batch other statement types as well.
Do note that by default, jOOQ's UpdatableRecord tries to fetch generated identity values (see Settings.returnIdentityOnUpdatableRecord), which is something that prevents batching. Such store() calls must be executed immediately, because you might expect the identity value to be available.
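If you don't need the identity value right after store(), here is a minimal sketch of switching that behaviour off when building the DSLContext; the dialect and connection are placeholders for your actual configuration:

import org.jooq.DSLContext;
import org.jooq.SQLDialect;
import org.jooq.conf.Settings;
import org.jooq.impl.DSL;

// Disable fetching of generated identity values on store(),
// so that such stores can participate in a batch.
Settings settings = new Settings()
    .withReturnIdentityOnUpdatableRecord(false);

DSLContext ctx = DSL.using(connection, SQLDialect.POSTGRES, settings);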

Method caching with Spring Boot and Hazelcast. How and where do I specify my refresh/reload intervals?

I realise the @Cacheable annotation helps me with caching the result of a particular method call, and subsequent calls are returned from the cache if there are no changes to the arguments, etc.
I have a requirement where I'm trying to minimise the number of calls to a DB, and hence I load the entire table. However, I would like to reload this data, say, every day, just to ensure that my cache is not out of sync with the underlying data in the database.
How can I specify such reload/refresh intervals?
I'm trying to use Spring Boot and Hazelcast. All the examples I have seen talk about specifying LRU, LFU, etc. policies in the config file for maps, but nothing at a method level.
I can't go with the LRU/LFU eviction policies, as I intend to reload the entire table data every x hours or x days.
Kindly help, or point me to any such implementation or docs.
Spring's @Cacheable doesn't support this kind of policy at the method level. See, for example, the code for CacheableOperation.
If you are using Hazelcast as your cache provider for Spring, you can explicitly evict elements or load data by using the corresponding IMap from your HazelcastInstance, as sketched below.
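For example, a minimal sketch (the map name "products", the cron expression, and the presence of @EnableScheduling are all illustrative assumptions): a scheduled job evicts the backing IMap once a day, so the next @Cacheable call repopulates it from the database.

import com.hazelcast.core.HazelcastInstance;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class ProductCacheRefresher {

    private final HazelcastInstance hazelcastInstance;

    public ProductCacheRefresher(HazelcastInstance hazelcastInstance) {
        this.hazelcastInstance = hazelcastInstance;
    }

    // Clear the IMap backing the "products" cache once a day at midnight;
    // the next @Cacheable call will reload the data from the database.
    @Scheduled(cron = "0 0 0 * * *")
    public void evictProducts() {
        hazelcastInstance.getMap("products").evictAll();
    }
}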

spring ehcache integration with self-populating-cache-scope

I have to integrate Spring and Ehcache, and I am trying to implement it with the BlockingCache pattern:
<ehcache:annotation-driven/>
There is one option, self-populating-cache-scope, with the values shared (the default) and method. Could you please explain what the difference is?
There is also the @Cacheable annotation with a selfPopulating flag.
As per what I read in a post on the ehcache-spring-annotations group
http://groups.google.com/group/ehcache-spring-annotations/browse_thread/thread/7dbc71ce34f6ee19/b057610167dfb815?lnk=raot
it says that when shared is used, only one instance is created and the same one is used every time the same cache name is used. So if I set the selfPopulating flag to true for one method, all the threads trying to access other methods annotated with @Cacheable with selfPopulating set to true will go on hold, which I don't want.
<ehcache:annotation-driven/>
When self-populating-cache-scope = method, on the other hand, separate instances are created for all methods annotated with @Cacheable with selfPopulating set to true, so it doesn't create that problem.
But in this case, when I try to remove an element using @TriggerRemove, giving the cache name used in @Cacheable, will it search each of those separate instances to find the value? Isn't this an overhead?
Answered by Eric on the Ehcache Google group linked above:
In all cases there is one underlying Ehcache instance. What happens when you set selfPopulating=true is that a SelfPopulatingCache wrapper is created.
If cache-scope=shared, then all annotations using that named cache will use the same SelfPopulatingCache wrapper. If cache-scope=method, then one wrapper is created per method.
Note that in both cases the SelfPopulatingCache is a wrapper; there is still only one actual cache backing the wrapper(s).
As for blocking: if you read the docs for SelfPopulatingCache and BlockingCache, you'll notice that Ehcache strikes a compromise between cache-level locking and per-key locking via key striping.
http://ehcache.org/apidocs/net/sf/ehcache/constructs/blocking/BlockingCache.html

ASP.NET MVC - Repository pattern with Entity Framework

When you develop an ASP.NET application using the repository pattern, do each of your methods create a new entity container instance (context) with a using block for each method, or do you create a class-level/private instance of the container for use by any of the repository methods until the repository itself is disposed? Other than what I note below, what are the advantages/disadvantages? Is there a way to combine the benefits of each of these that I'm just not seeing? Does your repository implement IDisposable, allowing you to create using blocks for instances of your repo?
Multiple containers (vs. single)
Advantages:
Preventing connections from being auto-closed/disposed (will be closed at the end of the using block).
Helps force you to only pull into memory what you need for a particular view/viewmodel, and in less round-trips (you will get a connection error for anything you attempt to lazy load).
Disadvantages:
Access of child entities within the Controller/View is limited to what you called with Include()
For pages like a dashboard index that shows information gathered from many tables (many different repository method calls), we will add the overhead of creating and disposing many entity containers.
If you are instantiating your context in your repository, then you should always do it locally, and wrap it in a using statement.
If you're using Dependency Injection to inject the context, then let your DI container handle calling dispose on the context when the request is done.
Don't instantiate your context directly as a class member, since this will not dispose of the context's resources until garbage collection occurs. If you do, then you will need to implement IDisposable to dispose the context, and make sure that whatever is using your repository properly disposes of your repository.
I, personally, put my context at the class level in my repository. My primary reason for doing so is because a distinct advantage of the repository pattern is that I can easily swap repositories and take advantage of a different backend. Remember - the purpose of the repository pattern is that you provide an interface that provides back data to some client. If you ever switch your data source, or just want to provide a new data source on the fly (via dependency injection), you've created a much more difficult problem if you do this on a per-method level.
Microsoft's MSDN site has good information on the repository pattern. Hopefully this helps clarify some things.
I disagree with all four points:
Preventing connections from being auto-closed/disposed (will be closed at the end of the using block).
In my opinion it doesn't matter whether you dispose the context at the method level, the repository instance level, or the request level. You do, of course, have to dispose the context at the end of a single request: either by wrapping the repository method in a using statement, or by implementing IDisposable on the repository class (as you proposed) and wrapping the repository instance in a using statement in the controller action, or by instantiating the repository in the controller constructor and disposing it in the Dispose override of the controller class, or by instantiating the context when the request begins and disposing it when the request ends (some Dependency Injection containers will help with this work). Why should the context be "auto-disposed"? In a desktop application it is possible and common to have a context per window/view that might be open for hours.
Helps force you to only pull into memory what you need for a particular view/viewmodel, and in less round-trips (you will get a connection error for anything you attempt to lazy load).
Honestly I would enforce this by disabling lazy loading altogether. I don't see any benefit of lazy loading in a web application where the client is disconnected from the server anyway. In your controller actions you always know what you need to load and can use eager or explicit loading. To avoid memory overhead and improve performance, you can always disable change tracking for GET requests because EF can't track changes on a client's web page anyway.
Access of child entities within the Controller/View is limited to what you called with Include()
This is rather an advantage than a disadvantage, because you don't have the unwanted surprises of lazy loading. If you need to populate child entities later in the controller actions, depending on some condition, you could load them through additional repository methods (LoadNavigationProperty or something) with the same or even a new context.
For pages like a dashboard index that shows information gathered from many tables (many different repository method calls), we will add the overhead of creating and disposing many entity containers.
Creating contexts - and I don't think we are talking about hundreds or thousands of instances - is a cheap operation. I would call this a very theoretical overhead which doesn't play a role in practice.
I've used both approaches you mentioned in web applications and also the third option, namely to create a single context per request and inject this same context into every repository/service I need in a controller action. They all three worked for me.
Of course, if you use multiple contexts you have to be careful to do all the work in the same unit of work, to avoid attaching entities to multiple contexts, which will lead to well-known exceptions. It's usually not a problem to avoid these situations, but it requires a bit more attention, especially when processing POST requests.
Lately I use a context per request, because it is easier; I just don't see the benefit of having very narrow contexts, and I see no reason to use more than a single unit of work for the whole request processing. If I needed multiple contexts - for whatever reason - I could always create specialized methods which act with their own context instead of the "default context" of the request.

Resources