How to correctly use Spring Data Repository#save()? - spring

In Spring Data Repository interfaces, the following operation is defined:
public T save(T entity);
... and the documentation states that the application should continue working with the returned entity.
I know about the reasoning behind this decision, and it makes sense. I can also see that this works perfectly fine for simple models with independent entities. But given a more complex JPA model with lots of #OneToMany and #ManyToMany connections, the following question arises:
How is the application supposed to use the returned object, when all the rest of the loaded model still references the old one that was passed into save(...)? Also, there might be collections in the application that still contain the old entity. The JVM does not allow to globally "swap" the unsaved entity with the saved one.
So what is the correct usage pattern? Any best practices? I only encountered toy examples so far that do not use #OneToMany or #ManyToMany and thus don't run into this issue. I'm sure that a lot of smart people thought long and hard about this, but I can't see how to use this properly.

This is covered in section 3.2.7.1 of the JPA specification that describes how merge should work. In a nutshell, if the instance being saved is managed (existing), it is simply saved in-place. If not, it is copied to a managed instance (which may not necessarily be a different object since the spec does not mandate that a new instance must be created in this case) and all references from the instance being saved to other managed entities are also updated to refer to the managed instance. This of course requires that the relationships have been correctly defined from the entity being saved.
Indeed, this does not cover the case of storing an entity instance in an unmanaged collection (such as a static collection). That is anyways not advisable because a persisted entity must always be loaded through the persistence provider mechanism (who knows the entity instance may have changed in the persistent store).
Since I have been using JPA for the past many years and have never faced problems, I am confident that the section I have referenced above works well in all scenarios (subject to the JPA provider implementing it as intended). You should try some of the cases that worry you and post separate questions if you run into problems.

Related

Will Spring Data's save() method update an entity in the database if there are no changes in the entity?

When editing a form, the user may sometimes not change the form and still click the submit button. In one of the controller methods below, will the save() method perform a query to the database and update the fields even if the user didn't change anything?
PostMapping("/edit_entry/{entryId}")
public String update_entry(
#PathVariable("entryId") Long entryId,
#RequestParam String title,
#RequestParam String text
) {
Entry entry = this.entryRepo.findById(entryId).get();
if (!entry.getTitle().equals(title))
entry.setTitle(title);
if (!entry.getText().equals(text))
entry.setText(text);
this.entryRepo.save(entry);
return "redirect:/entries";
}
And also, are the "if" statements necessary in this case?
What exactly happens during a call to save(…) depends on the underling persistence technology. Fundamentally there a re two categories of implementations:
Implementations that actively manage entities. Examples of this are JPA and Neo4j. Those implementations keep track of the entities returned from the store and thus are able to detect changes in the first place. You pay for this with additional complexity as the entities are usually instrumented in some way and the change detection of course also takes time even if it ends up not detecting any changes. On the upside though the only trigger updates if needed.
Implementations that do not actively manage entities. Examples are JDBC and MongoDB. Those implementations do not keep track of entities loaded from the data store and thus do not instrument them. That also means that there is no way of detecting changes as all the implementation sees is an entity instance without any further context.
In your concrete example, a MongoDB implementation would still issue an update while JPA will not issue an update at all if the request params do not contain differing values.

Streamlining the implementation of a Repository Pattern and SOA

I'm working with Laravel 5 but I think this question can be applied beyond the scope of a single framework or language. The last few days I've been all about writting interfaces and implementations for repositories, and then binding services to the IoC and all that stuff. It feels extremely slow.
If I need a new method in my service, say, Store::getReviews() I must create the relationship in my entity model class (data source, in this case Eloquent) then I must declare the method in the repo interface to make it required for any other implementation, then I must write the actual method in the repo implementation, then I have to create another method on the service that calls on the repo to extract all reviews for the store... (intentional run-on sentence) It feels like too much.
Creating a new model now isn't as simple as extending a base model class anymore. There are so many files I have to write and keep track of. Sometimes I'll get confused as of to where exactly I should put something, or find halfway throught setting up a method that I'm in the wrong class. I also lost Eloquent's query building in the service. Everytime I need something that Eloquent has, I have to implement it in the repo and the service.
The idea behind this architecture is awesome but the actual implementation I am finding extremely tedious. Is there a better, faster way to do things? I feel I'm beeing too messy, even though I put common methods and stuff in abstract classes. There's just too much to write.
I've wrestled with all this stuff as I moved to Laravel 5. That's when I decided to change my approach (it was tough decision). During this process I've come to the following conclusions:
I've decided to drop Eloquent (and the Active Record pattern). I don't even use the query builder. I do use the DB fascade still, as it's handy for things like parameterized query binding, transactions, logging, etc. Developers should know SQL, and if they are required to know it, then why force another layer of abstraction on them (a layer that cannot replace SQL fully or efficiently). And remember, the bridge from the OOP world to the Relational Database world is never going to be pretty. Bear with me, keeping reading...
Because of #1, I switched to Lumen where Eloquent is turned off by default. It's fast, lean, and still does everything I needed and loved in Laravel.
Each query fits in one of two categories (I suppose this is a form of CQRS):
3.1. Repositories (commands): These deal with changing state (writes) and situations where you need to hydrate an object and apply some rules before changing state (sometimes you have to do some reads to make a write) (also sometimes you do bulk writes and hydration may not be efficient, so just create repository methods that do this too). So I have a folder called "Domain" (for Domain Driven Design) and inside are more folders each representing how I think of my business domain. With each entity I have a paired repository. An entity here is a class that is like what others may call a "model", it holds properties and has methods that help me keep the properties valid or do work on them that will be eventually persisted in the repository. The repository is a class with a bunch of methods that represent all the types of querying I need to do that relates to that entity (ie. $repo->save()). The methods may accept a few parameters (to allow for a bit of dynamic query action inside, but not too much) and inside you'll find the raw queries and some code to hydrate the entities. You'll find that repositories typically accept and/or return entities.
3.2. Queries (a.k.a. screens?): I have a folder called "Queries" where I have different classes of methods that inside have raw queries to perform display work. The classes kind of just help for grouping together things but aren't the same as Repositories (ie. they don't do hydrating, writes, return entities, etc.). The goal is to use these for reads and most display purposes.
Don't interface so unnecessarily. Interfaces are good for polymorphic situations where you need them. Situations where you know you will be switching between multiple implementations. They are unneeded extra work when you are working 1:1. Plus, it's easy to take a class and turn it into an interface later. You never want to over optimize prematurely.
Because of #4, you don't need lots of service providers. I think it would be overkill to have a service provider for all my repositories.
If the almost mythological time comes when you want to switch out database engines, then all you have to do is go to two places. The two places mentioned in #3 above. You replace the raw queries inside. This is good, since you have a list of all the persistence methods your app needs. You can tailor each raw query inside those methods to work with the new data-store in the unique way that data-store calls for. The method stays the same but the internal querying gets changed. It is important to remember that the work needed to change out a database will obviously grow as your app grows but the complexity in your app has to go somewhere. Each raw query represents complexity. But you've encapsulated these raw queries, so you've done the best to shield the rest of your app!
I'm successfully using this approach inspired by DDD concepts. Once you are utilizing the repository approach then there is little need to use Eloquent IMHO. And I find I'm not writing extra stuff (as you mention in your question), all while still keeping my app flexible for future changes. Here is another approach from a fellow Artisan (although I don't necessarily agree with using Doctrine ORM). Good Luck and Happy Coding!
Laravel's Eloquent is an Active Record, this technology demands a lot of processing. Domain entities are understood as plain objects, for that purpose try to utilizes Doctrime ORM. I built a facilitator for use Lumen and doctrine ORM follow the link.
https://github.com/davists/Lumen-Doctrine-DDD-Generator
*for acurated perfomance analisys there is cachegrind.
http://kcachegrind.sourceforge.net/html/Home.html

Basic Entity Framework Questions

I have an existing database, which I have been happily accessing using LINQtoSQL. Armed with Sanderson's MVC3 book I thought I'd have a crack at EF4.3, but am really fighting to get even basic functionality working.
Working with SQL 2008, VS2010, the folder architecture appears to be:
ABC.Domain.Abstract
ABC.Domain.Concrete
ABC.Domain.Concrete.ORM
ABC.Domain.Entities
Per examples, repository interfaces are abstract, actual repositories are concrete. Creating EDMX from the existing database puts that in the ORM folder and the Entities holds the classes I designed as part of the domain. So far so good.
However! I have not once persuaded the deceptively simple EfDbContext : DbContext class, with method to work...
public DbSet<ABC.Domain.Entities.Person> Person { get { return _context.Persons; }}
It complains about missing keys, that Person is not a entity class, that it cannot find the conceptual model, and so on.
Considering I have a basic connectionstring in the web.config, why is not creating a model on the fly to do simple matching?
Should the ORM folder exist, or should it simply be Concrete? (I have a .SQL subfolder for LINQtoSQL concret so it suits me to have .ORM but if it's a flaw, let's fix it).
Should I have my homespun entities AND the automatically produced ones or just one set?
The automatic ones inherit from EntityObject, mine are just POCO or POCO with complexTypes, but do not inherit from anything.
What ties the home designed Domain.Entities.Person type to the Persons property of the Context?
Sanderson's book implies that the matching is implicit if properties are identical, which they are, but that does not do it.
The app.config has an EF flavoured connection string in it, the web.config has a normal connection string in it. Which should I be using - assuming web.config at the moment - so do I delete app.config?
Your help is appreciated. Long time spent, no progress for some days now.
What ties the home designed Domain.Entities.Person type to the Persons
property of the Context?
You seem to have a misunderstanding here. Your domain entities are the entities for the database. There aren't two sets. If you actually want to have two sets of object classes (for whatever reason) you must write any mapping between the two manually. EF only knows about the classes which are part of the entity model.
You should also - if you are using EF 4.3 - apply the DbContext Generator T4 template to the EDMX file. Do not work with EntityObject derived entities! It is not supported with DbContext. The generator will build a set of POCO classes and prepare a derived DbContext. This set of POCO classes are the entities the DbContext will only know about and they should be your only set of domain entities.
The created DbContext will contain simple DbSet properties with automatic getters and setters...
public DbSet<Person> People { get; set; }
...and the Person class will be created as POCO as well.
Download the entity framework power tools:
http://visualstudiogallery.msdn.microsoft.com/72a60b14-1581-4b9b-89f2-846072eff19d
Right click in your project to 'reverse engineer an existing database' it will create the code classes for you. No need to use EDMX ,and this method will create the DbContext derived class for you
There are many questions here and you won't get an answer, but I'll stick my 5 pence for what it's worth.
Sanderson's MVC3 book
Your problems are not to do with MVC3, they are to do with Entity Framework and data persistence layer.
ABC.Domain.Abstract ABC.Domain.Concrete ABC.Domain.Concrete.ORM
ABC.Domain.Entities
Can you say why this is separated in such a way? I would argue and say that ABC.Domain should contain your POCOs independent of your persistence layer (EF) and your presentation layer (MVC). Your list implies that your domain contains ORM and your data access entities. I'm not arguing here, what I'm trying to say, is that you need to understand what you really need.
At the end of a day, I'm certain that a simple example would suffice with ABC.DataAccess, ABC.Domain and ABC.Site.
Do you understand why repositories are abstract and concrete? If you don't, then leave out interfaces and see whether you can improve it with interfaces later.
Person is not a entity class, that it cannot find the conceptual
model, and so on.
Now, there are multiple ways you can get EF to persist data for you. You can use code first, where, as the name implies, you will write code first, and EF will generate database, relations and all the relevant constraints for you.
You can use database first, where EF will generate relevant class and data access related objects from your database. This is less preferable method for me, as it relies heavily upon your database structure.
You can use model first, where you will design your class in EDMX designer and it will then generate relevant SQL for you.
All of these might sound like a bit of black box, but for what you are trying to achieve all of them will work. EDMX is a good way to learn and there are many step by step tutorials on ASP.Net.
but if it's a flaw, let's fix it).
You will have to fix and refactor yourself, there is no other way to improve in my honest opinion. I can give you a different folder/namespace structure, but there will always be a "better" one.
Should I have my homespun entities AND the automatically produced ones
or just one set?
Now this depends on the model that you have chosen. Database first, code first, code only and whatever else is there. If you are following domain driven development, then you will have to work with classes, that represent your business logic and that are not tied up to your data persistence layer or presentation layers, therefore POCO is a way forward.
What ties the home designed Domain.Entities.Person type to the Persons
Now this again depends on the model that you are using.
The app.config and web.config
When you are running your web application, the connection string from web application will be used. Please correct me if I'm wrong.
Your help is appreciated. Long time spent, no progress for some days
now.
General advise, leave MVC alone for the time being. Get it to work in a console application and make sure you feel comfortable with options offered in EF. Good luck :)
The solution to why nothing worked code-first...
...turned out to be a reference to System.Data.EntityClient in the connection string, which ought to have read System.Data.SqlClient.
Without this provider entry being correct, it was unable to work code-first.
Finding which connectionString it was using was a case of deliberately mis-spelling a keyword in the connections there were to choose from - they all were named correctly - but were in app.config, and 2 places in the web.config. With a distinct naming error, when the application threw an error trying to create the domain model, it was easy to identify which connection string my derived DbContext class was using. Correcting the ProviderName made all the difference.
Code-first is now working just fine, with seeded values on model changes.

Trying to improve efficiency of MVC3 + Unity project

I have an MVC3 project that uses Unity for dependency injection.
There is a main MVC3 project, a “domain” class library that sits between MVC3 and the data tier, and a bunch of class libraries that make up the data tier.
(MVC3) – (domain) – (data tier)
This is an example of one of the service constructors in the domain class:
public DomainModelCacheServices(
Data.Interface.ICountryRepository countryRepository,
Data.Interface.ILanguageRepository languageRepository,
Data.Interface.ISocialNetRepository socialNetRepository
)
Every time a controller is called that has DomainModelCacheServices in its constructor, a new DomainModelCacheServices object is constructed, plus the three repository classes in the constructor of DomainModelCacheServices.
I cannot believe this is efficient!
What makes this worse is that the class DomainModelCacheServices is a cache class. It loads lists of data that never change, and holds them as statics. But it still needs to construct three repository classes for every reference!
If I give DomainModelCacheServices the lifetime of a singleton (forever), I have to ensure it is thread-safe, and if the day comes when I am getting hundreds of hits, there’s going to be a lot of locking.
I could change the constructor to this:
public DomainModelCacheServices(
IServiceLocator serviceLocator
)
I don’t know why, but this doesn’t look right. The constructor becomes meaningless to the eye, and I have to reference Unity in the domain class and somehow make the domain class aware of the ServiceLocator owned by the MVC3 application. Maybe the loose-coupling can be too loose?
Maybe constructing all these classes is not as inefficient as it looks I shouldn’t worry about it?
What would be nice is if Unity supported “Lazy” constructor parameters. But it doesn’t.
So, any ideas on how to make an MVC3 + Unity project more efficient, specifically in the domain model design?
Thanks for reading!
The cache shouldn't be definied on the domain level but on the repositories implemntation level (so in DAL). So for example ICountryRepository should have two implementations in DAL : CountryRepository and ChachedCountryRepository. These should be wired as decorators in Unity (CountryRepository is inside the ChachedCountryRepository). CachedCountryRepository would check if the data is in the cache and if not it would pass the call to the inner CountryRepository.
Creating objects is not expensive and wouldn't care too much about issues as a caching is correctly definied.
Great reasoning.
However, creating objects is cheap. I would not create a singleton since you already are caching all objects in static fields. The current approach is easy to understand.
I got another question for you:
Why are you not caching in your repository classes?
The repositories are responsible for the data and all data handling should be transparent to everything else. It also makes everything easier since they are responsible of updating the data sources. How do you keep the cache in sync with changes today? Through domain events?
I would create a cache class which I would use as a private field in the repository.

Static? Repositories MVC3, EF4.2 (Code First)

I'm new to MVC, EF and the like so I followed the MVC3 tutorial at http://www.asp.net/mvc and set up an application (not yet finished with everything though).
Here's the "architecture" of my application so far
GenericRepository
PropertyRepository inherits GenericRepository for "Property" Entity
HomeController which has the PropertyRepository as Member.
Example:
public class HomeController
{
private readonly PropertyRepository _propertyRepository
= new PropertyRepository(new ConfigurationDbContext());
}
Now let's consider the following:
I have a Method in my GenericRepository that takes quite some time, invoking 6 queries which need to be in one transaction in order to maintain integrity. My google results yeldet that SaveChanges() is considered as one transaction - so if I make multiple changes to my context and then call SaveChanges() I can be "sure" that these changes are "atomic" on the SQL Server. Right? Wrong?
Furthermore, there's is an action method that calls _propertyRepository.InvokeLongAndComplex() Method.
I just found out: MVC creates a new controller for each request. So I end up with multiple PropertyRepositories which mess up my Database Integrity. (I have to maintain a linked list of my properties in the database, and if a user moves a property it needs 6 steps to change the list accordingly but that way I avoid looping through all entities when having thousands...)
I thougth about making my GenericRepository and my PropertyRepository static, so every HomeController is using the same Repository and synchronize the InvokeLongAndComplex Method to make sure there's only one Thread making changes to the DB at a time.
I have the suspicion that this idea is not a good solution but I fail to find a suitable solution for this problem - some guys say that's okay to have static repositories (what happens with the context though?). Some other guys say use IOC/DI (?), which sounds like a lot of work to set up (not even sure if that solves my problem...) but it seems that I could "tell" the container to always "inject" the same context object, the same Repository and then it would be enough to synchronize the InvokeLongAndComplex() method to not let multiple threads mess up the integrity.
Why aren't data repositories static?
Answer 2:
2) You often want to have 1 repository instance per-request to make it easier to ensure that uncommited changes from one user don't mess things up for another user.
why have a repository instance per-request doesn't it mess up my linked list again?
Can anyone give me an advice or share a best practice which I can follow?
No! You must have a new context for each request so even if you make your repositories static you will have to pass current context instance to each its method instead of maintaining single context inside repository.
What you mean by integrity in the first place? Are you dealing with transactions, concurrency issues or referential constraints? Handling all of these issues is your responsibility. EF will provide some basic infrastructure for that but the final solution is still up to your implementation.

Resources