Fetch annotation in SDG 2.0, fetching strategy questions - spring

Hi all, patient developers using Spring Data Graph. Since there is so little documentation and rather poor test coverage, it is sometimes very difficult to understand how the underlying framework is supposed to behave. I currently have some questions about the new fetching approach in SDG 2.0. In contrast to the write/read-through behavior of SDG 1.1, in 2.0 only relations and related objects annotated with @Fetch are fetched eagerly; the others are supposed to be fetched lazily. Now my first question:
Is it possible to configure SDG so that, if loading an entity and invoking a getter on a lazy relation take place in the same transaction, the requested collection is fetched automatically? Something like a persistence context in transaction scope, or is this perhaps planned for future releases?
How can I fetch a lazy collection at once for a @RelatedTo annotation? The fetch() method on Neo4jOperations allows fetching only one entity. Do I have to iterate through the whole list and fetch each object? What would be the best way to check whether a given object is already fetched / initialized?
As a suggestion, I think it would be more intuitive if some kind of lazy-loading exception were thrown instead of getting an NPE when working with uninitialized objects. Moreover, the behavior is misleading: when an object is not initialized, all member properties apart from id are null, so equals can return true for different uninitialized objects. That is quite a serious issue when, for example, sets are used.
Another issue I noticed when working with SDG 2.0.0.RC1: when I add a new object to a collection that has not been fetched, it is sometimes properly added and persisted and sometimes not. I wrote a test for this case and it behaves non-deterministically: sometimes it fails, sometimes it succeeds. Here is the use case:
Group groupFromDb = neoTemplate.findOne(group.getId(), Group.class);
assertNotNull(groupFromDb);
assertEquals("Number of members must be equals to 1", 1, groupFromDb.getMembers().size());
User secondMember = UserMappingTest.createUser("secondMember");
groupFromDb.addMember(secondMember);
neoTemplate.save(groupFromDb);
Group groupAfterChange = neoTemplate.findOne(groupFromDb.getId(), Group.class);
assertNotNull(groupAfterChange);
assertEquals("Number of members must be equals to saved entity", groupFromDb.getMembers().size(), groupAfterChange.getMembers().size());
assertEquals("Number of members must be equals to 2", 2, groupAfterChange.getMembers().size());
This test sometimes fails on the last assert, which would mean that the member is sometimes added to the set and sometimes not. I guess the problem lies somewhere in ManagedFieldAccessorSet, but it is difficult to say since it is non-deterministic. I ran the test with mvn2 and mvn3 on Java 1.6_22 and 1.6_27 and always got the same result: sometimes it passes, sometimes it fails. The implementation of User.equals looks as follows:
@Override
public boolean equals(final Object other) {
    if (!(other instanceof User)) {
        return false;
    }
    User castOther = (User) other;
    if (castOther.getId() == this.getId()) {
        return true;
    }
    return new EqualsBuilder().append(username, castOther.username).isEquals();
}
I also find it a bit problematic that for fields annotated with @Fetch a java HashSet is used, which is serializable, while for lazily loaded fields ManagedFieldAccessorSet is used, which is not serializable and causes a NotSerializableException.
Any help or advice is welcome. Thanks in advance!

I put together a quick code sample showing how to use the fetch() technique Michael describes:
http://springinpractice.com/2011/12/28/initializing-lazy-loaded-collections-with-spring-data-neo4j/

The simple mapping approach was only added to Spring Data Neo4j 2.0, so it is not as mature as the advanced AspectJ mapping. We're currently working on documenting it more extensively.
The lazy loading option was also added lately. So your feedback is very welcome.
Right now SDN doesn't employ a proxy approach for the lazily loaded objects. So the automatic "fetch on access" is not (yet) supported. That's why also no exception is thrown when accessing non-loaded fields and there is no means of "discovering" if an entity was not fully loaded.
In the current snapshot there is the template.fetch() operation to fully load lazy loaded objects and collections.
We'll look into the HashSet vs. ManagedSet issue, it is correct that this is not a good solution.
As for the test case: does getId() return a Long object or a long primitive? It might be sensible to use getId().equals(castOther.getId()) here, as reference equality is not guaranteed for Number objects.
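To make the point about reference equality concrete, here is a minimal sketch in plain Java. The class LongIdEquality and its two helper methods are hypothetical names made up for this illustration; they are not part of Spring Data Neo4j or of the code in the question.

```java
// Illustrative only: LongIdEquality and its methods are hypothetical helpers.
class LongIdEquality {

    // Reference comparison, as in the equals method from the question.
    static boolean sameIdByReference(Long a, Long b) {
        return a == b;
    }

    // Value comparison that also tolerates a null id on the left-hand side.
    static boolean sameIdByValue(Long a, Long b) {
        return a != null && a.equals(b);
    }

    public static void main(String[] args) {
        Long first = Long.valueOf(1000L);
        Long second = Long.valueOf(1000L);
        // May well print false: boxed values outside -128..127 are not
        // guaranteed to be the same object.
        System.out.println("by reference: " + sameIdByReference(first, second));
        // Compares the wrapped values rather than the object identity.
        System.out.println("by value: " + sameIdByValue(first, second));
    }
}
```

If getId() returns a long primitive, == is fine; the value comparison only matters for the boxed Long case.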

Related

Will Spring Data's save() method update an entity in the database if there are no changes in the entity?

When editing a form, the user may sometimes not change the form and still click the submit button. In one of the controller methods below, will the save() method perform a query to the database and update the fields even if the user didn't change anything?
@PostMapping("/edit_entry/{entryId}")
public String update_entry(
        @PathVariable("entryId") Long entryId,
        @RequestParam String title,
        @RequestParam String text
) {
    Entry entry = this.entryRepo.findById(entryId).get();
    if (!entry.getTitle().equals(title))
        entry.setTitle(title);
    if (!entry.getText().equals(text))
        entry.setText(text);
    this.entryRepo.save(entry);
    return "redirect:/entries";
}
And also, are the "if" statements necessary in this case?
What exactly happens during a call to save(…) depends on the underlying persistence technology. Fundamentally there are two categories of implementations:
Implementations that actively manage entities. Examples of this are JPA and Neo4j. Those implementations keep track of the entities returned from the store and thus are able to detect changes in the first place. You pay for this with additional complexity, as the entities are usually instrumented in some way, and the change detection of course also takes time even if it ends up not detecting any changes. On the upside, though, they only trigger updates if needed.
Implementations that do not actively manage entities. Examples are JDBC and MongoDB. Those implementations do not keep track of entities loaded from the data store and thus do not instrument them. That also means that there is no way of detecting changes as all the implementation sees is an entity instance without any further context.
In your concrete example, a MongoDB implementation would still issue an update while JPA will not issue an update at all if the request params do not contain differing values.
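The difference between the two categories can be sketched in plain Java. Everything below (SaveSketch, ManagingStore, PlainStore, Entry) is a hypothetical stand-in written for this illustration; it is not how Spring Data, JPA, or MongoDB are actually implemented, only a toy model of "compare against the loaded state" versus "always write".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-ins, not Spring Data classes.
class SaveSketch {

    static final class Entry {
        final long id;
        String title;

        Entry(long id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    // "Managing" store: remembers the state it handed out and compares on save.
    static final class ManagingStore {
        private final Map<Long, String> loadedTitles = new HashMap<>();
        int updatesIssued = 0;

        Entry load(long id, String titleInDb) {
            loadedTitles.put(id, titleInDb);
            return new Entry(id, titleInDb);
        }

        void save(Entry e) {
            // Only write when the current state differs from the loaded snapshot.
            if (!Objects.equals(loadedTitles.get(e.id), e.title)) {
                updatesIssued++;
                loadedTitles.put(e.id, e.title);
            }
        }
    }

    // "Non-managing" store: has no snapshot, so every save becomes a write.
    static final class PlainStore {
        int updatesIssued = 0;

        void save(Entry e) {
            updatesIssued++;
        }
    }

    public static void main(String[] args) {
        ManagingStore managing = new ManagingStore();
        Entry entry = managing.load(1L, "Hello");
        entry.title = "Hello"; // user submitted the form without changes
        managing.save(entry);

        PlainStore plain = new PlainStore();
        plain.save(entry);

        System.out.println("managing updates: " + managing.updatesIssued); // 0
        System.out.println("plain updates: " + plain.updatesIssued);       // 1
    }
}
```

In this toy model the "if" checks from the question make no difference to either store: the managing one compares values itself, and the plain one writes regardless.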

Read null as empty set in springdata-cassandra

I use spring-data-cassandra, and have entity like this:
@Table("users")
public class User {
    @Column("permissions")
    @CassandraType(type = DataType.Name.SET, typeArguments = {DataType.Name.TEXT})
    public Set<String> permissions = new HashSet<>();
}
In cassandra I have table users with field permissions of type Set. It works fine when I store some values in the set, but when I try to store empty set, it becomes null when I read such entity from the repository.
Is there a way to force spring-data-cassandra to change null to empty HashSet? Or can I somehow add custom reader for this specific property of the entity?
TL;DR;
That's Cassandra's default behavior to return null for empty Collection and Map-typed columns.
Further Read
Cassandra returns null values for lists, sets, and maps, which do not contain any items. This is especially unfortunate when using classes with pre-initialized fields as seen in your question. There's an open ticket (DATACASS-266 - Loading empty collection-typed properties overwrites pre-initialized fields) in the issue tracker - as of now, without comments or votes.
We're not exactly sure whether it's a good idea to skip setting properties or apply some sort of defaulting when dealing with empty (null) collections as this raises follow-up questions what to do when:
Creating an instance through constructor creation: A value is required in such case. For property access, we could omit to set the property, for constructor creation we must provide a value.
The pre-initialized collection contains items but the one received from Cassandra is null.
Assuming the change were applied, what would happen to existing code that relies on empty collections defaulting to null?
A possibility to address this behavior could be configuration on MappingCassandraConverter or an extension point to override so users can apply their own empty collection behavior.
I've been trying to eliminate the null collections in my model objects as well, and while it may not be possible to do that at the Spring Data level currently (version 2.1.x), there are some options you can consider:
Use property access for the field in question (i.e. use the annotation @AccessType(PROPERTY)), and in the setter method, set the field to an empty collection when the argument is null.
Define a compatible (see below) constructor that sets the field to an empty collection when a null is provided (and if the model is mutable, you may still want to provide the setter as above).
There are some caveats to ensure Spring Data Cassandra uses the desired constructor (e.g. don't provide a no argument constructor), so it's critical to review the "Object Mapping Fundamentals" section of the reference guide (https://docs.spring.io/spring-data/cassandra/docs/current/reference/html/#mapping.fundamentals).
Among the recommendations in that reference guide (as of version 2.1 at least) is to use an all-argument constructor and make model objects immutable, which would work well with the constructor-based approach to handling nulls. It does mean, though, writing and maintaining the constructor to handle the nulls rather than relying on Lombok's @AllArgsConstructor.
I have used the property access approach in one case, but not the constructor approach. However, I do intend to go the constructor route when adding new model classes (I'm a fan of immutable objects, and will explore that route even without any collection fields).
I believe Spring Data Cassandra 2.0 also added persistence lifecycle callbacks which is another possible option I suppose, but I ruled that out, mainly because the logic would not reside in the model class itself (as well as going against the recommendations from the creators of the framework)
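As a rough sketch of the setter and constructor options described above, here is a plain Java class without any Spring annotations so that it compiles on its own. UserModelSketch and emptyIfNull are hypothetical names; in a real entity you would add the mapping annotations from the question (and @AccessType(PROPERTY) for the setter variant), and whether Spring Data Cassandra picks this constructor depends on the rules in the reference guide.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model class; mapping annotations are omitted on purpose.
final class UserModelSketch {

    private Set<String> permissions;

    // Constructor variant: a null argument becomes an empty set.
    UserModelSketch(Set<String> permissions) {
        this.permissions = emptyIfNull(permissions);
    }

    public Set<String> getPermissions() {
        return permissions;
    }

    // Setter variant: intended to be called through property access.
    public void setPermissions(Set<String> permissions) {
        this.permissions = emptyIfNull(permissions);
    }

    // Copies the incoming set, or creates an empty one when null was passed.
    private static Set<String> emptyIfNull(Set<String> source) {
        return source == null ? new HashSet<>() : new HashSet<>(source);
    }

    public static void main(String[] args) {
        UserModelSketch user = new UserModelSketch(null);
        System.out.println(user.getPermissions().isEmpty()); // true
    }
}
```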

@Cacheable() not returning proper cache

I am well aware that there are multiple questions on this topic, but I just can't make sense of it. The problem seems to be that @CachePut does not add the new value to the @Cacheable list.
After debugging the problem I found out that the problem seems to be in the key.
Here is the code snippet
@CacheConfig(cacheNames = "documents")
interface DocumentRepository {
    @CachePut(key = "#a0.id")
    Document save(Document document);
    @Cacheable()
    List<Document> findAll();
}
So when I invoke the save method, the key being used for caching is an incrementing integer: 1, 2, 3, ...
But when I try to get all documents, the cache uses SimpleKey[] as the key. If I try to use the same key for @Cacheable, I get a SpelEvaluationException: property 'id' cannot be found on null.
So what I am left with in the end is a functional cache (the data is saved in the cache), but somehow I am not able to retrieve it.
The underlying cache implementation is EhCache.
I really don't understand what you are expecting here.
It looks like you expect your findAll method to return the full content of the cache named documents. I don't think there is anything in the documentation that would let you conclude that this feature exists (it does not). It would also be very fragile: if we implemented that, findAll would return different results depending on the state of the cache, for instance if someone configured this cache with a max size of 100, or if the cache isn't warmed up on startup.
You can't expect a cache abstraction (or even a cache library) to maintain a synchronized view of "a list of objects". What findAll does is return the entry that corresponds to a key with no argument (new SimpleKey by default).
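A toy model in plain Java may make the key mismatch easier to see. CacheKeySketch, NO_ARGS_KEY, and the String "documents" are made up for this illustration; the map only imitates what a cache named documents would hold, it is not Spring's cache abstraction or EhCache.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of one cache shared by save() and findAll(); not Spring code.
class CacheKeySketch {

    // Stand-in for the key generated for a method without arguments.
    static final Object NO_ARGS_KEY = new Object();

    final Map<Object, Object> documentsCache = new HashMap<>();
    final List<String> database = new ArrayList<>();

    // Imitates @CachePut(key = "#a0.id"): always runs, stores the result under the id.
    String save(int id, String document) {
        database.add(document);
        documentsCache.put(id, document);
        return document;
    }

    // Imitates @Cacheable() on a no-argument method: one entry under one key.
    @SuppressWarnings("unchecked")
    List<String> findAll() {
        Object cached = documentsCache.get(NO_ARGS_KEY);
        if (cached != null) {
            return (List<String>) cached;
        }
        List<String> result = new ArrayList<>(database);
        documentsCache.put(NO_ARGS_KEY, result);
        return result;
    }

    public static void main(String[] args) {
        CacheKeySketch repo = new CacheKeySketch();
        repo.save(1, "first");
        System.out.println(repo.findAll()); // [first]
        repo.save(2, "second");
        // Still the list cached earlier: save() wrote under key 2, not under NO_ARGS_KEY.
        System.out.println(repo.findAll()); // [first]
    }
}
```

The cache ends up with three unrelated entries (keys 1, 2, and the no-argument key), which is why the saved documents are "in the cache" yet never show up in the cached list.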

Difference between CrudRepository findOne() and JpaRepository getOne()

I read that getOne() is lazy loaded and findOne() fetches the whole entity right away. I've checked the debugging log and even enabled monitoring on my SQL server to see what statements get executed, and I found that both getOne() and findOne() generate and execute the same query. However, when I use getOne() the values are initially null (except for the id, of course).
So could anyone please tell me: if both methods execute the same query on the database, why should I use one over the other? I'm basically looking for a way to fetch an entity without getting all of its children/attributes.
EDIT1:
Entity code
Dao code:
@Repository
public interface FlightDao extends JpaRepository<Flight, Long> {
}
Debugging log findOne() vs getOne()
EDIT2:
Thanks to Chlebik I was able to identify the problem. Like Chlebik stated, if you try to access any property of the entity fetched by getOne() the full query will be executed. In my case, I was checking the behavior while debugging, moving one line at a time, I totally forgot that while debugging the IDE tries to access object properties for debugging purposes (or at least that's what I think is happening), so debugging triggers the full query execution. I stopped debugging and then checked the logs and everything appears to be normal.
getOne() vs findOne() (this log is taken from the MySQL general_log and not from Hibernate):
Debugging log
No debugging log
It is just a guess but in 'pure JPA' there is a method of EntityManager called getReference. And it is designed to retrieve entity with only ID in it. Its use was mostly for indicating reference existed without the need to retrieve whole entity. Maybe the code will tell more:
// em is EntityManager
Department dept = em.getReference(Department.class, 30); // Gets only entity with ID property, rest is null
Employee emp = new Employee();
emp.setId(53);
emp.setName("Peter");
emp.setDepartment(dept);
dept.getEmployees().add(emp);
em.persist(emp);
I assume then getOne serves the same purpose. Why the queries generated are the same you ask? Well, AFAIR in JPA bible - Pro JPA2 by Mike Keith and Merrick Schincariol - almost every paragraph contains something like 'the behaviour depends on the vendor'.
EDIT:
I've built my own test setup. I finally came to the conclusion that if you interfere in any way with the entity fetched with getOne (even just calling entity.getId()), it causes SQL to be executed. However, if you use it only as a proxy (e.g. as a relationship indicator, as shown in the code above), nothing happens and no additional SQL is executed. So I assume that in your service class you do something with this entity (use a getter, log something), and that is why the output of these two methods looks the same.
ChlebikGitHub with example code
SO helpful question #1
SO helpful question #2
Suppose you want to remove an entity by id. In SQL you can execute a query like this:
"delete from TABLE_NAME where id = ?".
And in Hibernate, first you have to get a managed instance of your Entity and then pass it to EntityManager.remove method.
Entity a = em.find(Entity.class, id);
em.remove(a);
But this way, you have to fetch the entity you want to delete from the database before deleting it. Is that really necessary?
The method EntityManager.getReference returns a Hibernate proxy without querying the database or setting the properties of your entity, unless you try to read properties of the returned proxy yourself.
JpaRepository.getOne uses EntityManager.getReference instead of EntityManager.find. So whenever you need a managed object but don't really need to query the database for it, it's better to use JpaRepository.getOne to eliminate the unnecessary query.
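The "no query until a property is read" idea can be imitated in plain Java. LazyReferenceSketch and LazyFlight are hypothetical names for this illustration; a real Hibernate proxy is generated bytecode, and whether reading the id alone triggers a query depends on the provider and mapping. This sketch simply assumes the id is known up front.

```java
import java.util.function.Supplier;

// Hand-written imitation of a lazy reference; not a Hibernate proxy.
class LazyReferenceSketch {

    static final class LazyFlight {
        private final long id;
        private final Supplier<String> loader; // stands in for the SELECT
        private String name;
        private boolean loaded;
        int loadCount;

        LazyFlight(long id, Supplier<String> loader) {
            this.id = id;
            this.loader = loader;
        }

        // Assumption of this sketch: the id is available without loading.
        long getId() {
            return id;
        }

        // First access runs the loader once; later calls reuse the value.
        String getName() {
            if (!loaded) {
                name = loader.get();
                loaded = true;
                loadCount++;
            }
            return name;
        }
    }

    public static void main(String[] args) {
        LazyFlight ref = new LazyFlight(30L, () -> "LH 400");
        System.out.println(ref.loadCount); // 0
        System.out.println(ref.getName()); // LH 400
        System.out.println(ref.loadCount); // 1
    }
}
```

This also matches the debugger observation from the question: anything that reads a property, including an IDE inspecting the object, counts as an access and triggers the load.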
If no data is found in the table for a particular ID, findOne will return null, whereas getOne will throw javax.persistence.EntityNotFoundException.
Both have their own pros and cons. For example:
If missing data is not a failure case for you (e.g. you are just verifying that the data has been deleted, and success means the result is null), you can use findOne.
Otherwise, you can use getOne.
Adjust this to your requirements if you know the possible outcomes.

How to correctly use Spring Data Repository#save()?

In Spring Data Repository interfaces, the following operation is defined:
public T save(T entity);
... and the documentation states that the application should continue working with the returned entity.
I know about the reasoning behind this decision, and it makes sense. I can also see that this works perfectly fine for simple models with independent entities. But given a more complex JPA model with lots of @OneToMany and @ManyToMany connections, the following question arises:
How is the application supposed to use the returned object when all the rest of the loaded model still references the old one that was passed into save(...)? Also, there might be collections in the application that still contain the old entity. The JVM does not allow globally "swapping" the unsaved entity with the saved one.
So what is the correct usage pattern? Any best practices? So far I have only encountered toy examples that do not use @OneToMany or @ManyToMany and thus don't run into this issue. I'm sure that a lot of smart people thought long and hard about this, but I can't see how to use this properly.
This is covered in section 3.2.7.1 of the JPA specification that describes how merge should work. In a nutshell, if the instance being saved is managed (existing), it is simply saved in-place. If not, it is copied to a managed instance (which may not necessarily be a different object since the spec does not mandate that a new instance must be created in this case) and all references from the instance being saved to other managed entities are also updated to refer to the managed instance. This of course requires that the relationships have been correctly defined from the entity being saved.
Indeed, this does not cover the case of storing an entity instance in an unmanaged collection (such as a static collection). That is not advisable anyway, because a persisted entity must always be loaded through the persistence provider mechanism (who knows, the entity instance may have changed in the persistent store).
Since I have been using JPA for the past many years and have never faced problems, I am confident that the section I have referenced above works well in all scenarios (subject to the JPA provider implementing it as intended). You should try some of the cases that worry you and post separate questions if you run into problems.
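For readers who want to see the "keep working with the returned instance" rule in isolation, here is a toy model of merge-like behavior in plain Java. MergeSketch, Context, and Item are hypothetical; this is a simplified reading of the merge description above, not the JPA provider's implementation, and it ignores relationships entirely.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of merge-like save semantics; not a JPA provider.
class MergeSketch {

    static final class Item {
        final long id;
        String name;

        Item(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    static final class Context {
        private final Map<Long, Item> managed = new HashMap<>();

        // Returns the managed instance; callers should continue with the result.
        Item save(Item item) {
            Item existing = managed.get(item.id);
            if (existing == item) {
                return item; // already managed: saved in place
            }
            if (existing == null) {
                existing = new Item(item.id, item.name); // copy into a managed instance
                managed.put(item.id, existing);
            } else {
                existing.name = item.name; // copy state onto the managed instance
            }
            return existing;
        }
    }

    public static void main(String[] args) {
        Context ctx = new Context();
        Item detached = new Item(1L, "draft");
        Item managed = ctx.save(detached);
        System.out.println(managed == detached);          // false in this sketch
        System.out.println(ctx.save(managed) == managed); // true
    }
}
```

The usage pattern this suggests is item = repository.save(item), with anything that still holds the old reference being re-read rather than patched by hand.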

Resources