Hibernate : em.getReference().getList().add(record) performance? - performance

When using Hibernate (JPA), if I do the following call :
MyParent parent = em.getReference("myId");
parent.getAListMappedAsOneToMany().add(record)
record.setParent(parent);
Is there any performance problem ?
My thoughts is that getReference does not load the entity and getAListeMappedAsOneToMany().add do not need to load the list as it is defined as lazy fetch.
getAListMappedAsOneToMany could return a very big list if it is really accessed (by calling get or size method).
Could you confirm that there is no performance problem with such a code ?

getReference() doesn't go to the database, and returns a proxy. But if you call a method on the proxy, it initializes the proxy and gets the entity data from the database. So since you call getAListMappedAsOneToMany() on your entity, you don't gain anything by calling getReference() instead of find().
Similarly, the list is loaded lazily. this means that it will only be loaded when you call a method on it. And you do call a method on it: add(). So the data of the elements in the list is also loaded from the database.
Turn on SQL logging in devlopment, to see and understand all the queries executed by Hibernate.
If you want to avoid loading the list, replace your code by
MyParent parent = em.getReference("myId");
record.setParent(parent);
This won't load anything from the database, and it will make the association persistent because Record.parent is the owner side of the association. But beware that this will also make your in-memory object graph inconsistent if the parent has already been loaded before.

getReference() is useful when you don't want to use any members of the object but to give the reference of the object to another object. For example, when entity A referencing entity B and you want to set your b as B of A, then getReference() is what you need.
But in your case, when you get the proxy object, you immediately try to reach one of its members. (aListMappedAsOneToMany) Thus this will result, the whole parent object will be loaded from db.
It is true that, when you getAListMappedAsOneToMany().add(record), it will not load from db yet only if you set inverse="true"
You can learn more information about performance from:
http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/performance.html#performance-collections-mostefficentinverse

Related

Difference between CrudRepository findOne() and JpaRepository getOne()

I read that getOne() is lazy loaded and findOne() fetches the whole entity right away. I've checked the debugging log and I even enabled monitoring on my sql server to see what statements gets executed, I found that both getOne() and findOne() generates and executes the same query. However when I use getOne() the values are initially null (except for the id of course).
So could anyone please tell me, if both methods executes the same query on the database, why should I use one over the other? I'm basically looking for a way to fetch an entity without getting all of its children/attributes.
EDIT1:
Entity code
Dao code:
#Repository
public interface FlightDao extends JpaRepository<Flight, Long> {
}
Debugging log findOne() vs getOne()
EDIT2:
Thanks to Chlebik I was able to identify the problem. Like Chlebik stated, if you try to access any property of the entity fetched by getOne() the full query will be executed. In my case, I was checking the behavior while debugging, moving one line at a time, I totally forgot that while debugging the IDE tries to access object properties for debugging purposes (or at least that's what I think is happening), so debugging triggers the full query execution. I stopped debugging and then checked the logs and everything appears to be normal.
getOne() vs findOne() (This log is taken from MySQL general_log and not hibernate.
Debugging log
No debugging log
It is just a guess but in 'pure JPA' there is a method of EntityManager called getReference. And it is designed to retrieve entity with only ID in it. Its use was mostly for indicating reference existed without the need to retrieve whole entity. Maybe the code will tell more:
// em is EntityManager
Department dept = em.getReference(Department.class, 30); // Gets only entity with ID property, rest is null
Employee emp = new Employee();
emp.setId(53);
emp.setName("Peter");
emp.setDepartment(dept);
dept.getEmployees().add(emp);
em.persist(emp);
I assume then getOne serves the same purpose. Why the queries generated are the same you ask? Well, AFAIR in JPA bible - Pro JPA2 by Mike Keith and Merrick Schincariol - almost every paragraph contains something like 'the behaviour depends on the vendor'.
EDIT:
I've set my own setup. Finally I came to conclusion that if You in any way interfere with entity fetched with getOne (even go for entity.getId()) it causes SQL to be executed. Although if You are using it only to create proxy (eg. for relationship indicator like shown in a code above), nothing happens and there is no additional SQL executed. So I assume in your service class You do something with this entity (use getter, log something) and that is why the output of these two methods looks the same.
ChlebikGitHub with example code
SO helpful question #1
SO helpful question #2
Suppose you want to remove an Entity by id. In SQL you can execute a query like this :
"delete form TABLE_NAME where id = ?".
And in Hibernate, first you have to get a managed instance of your Entity and then pass it to EntityManager.remove method.
Entity a = em.find(Entity.class, id);
em.remove(a);
But this way, You have to fetch the Entity you want to delete from database before deletion. Is that really necessary ?
The method EntityManager.getReference returns a Hibernate proxy without querying the database and setting the properties of your entity. Unless you try to get properties of the returned proxy yourself.
Method JpaRepository.getOne uses EntityManager.getReference method instead of EntityManager.find method. so whenever you need a managed object but you don't really need to query database for that, it's better to use JpaRepostory.getOne method to eliminate the unnecessary query.
If data is not found the table for particular ID, findOne will return null, whereas getOne will throw javax.persistence.EntityNotFoundException.
Both have their own pros and cons. Please see example below:
If data not found is not failure case for you (eg. You are just
verifying if data the data is deleted and success will be data to be
null), you can use findOne.
In another case, you can use getOne.
This can be updated as per your requirements, if you know outcomes.

Fetch annotation in SDG 2.0, fetching strategy questions

Hi all patient developers using spring data graph. Since there is so less documentation and pretty poor test coverage it is sometimes very difficult to understand what is the expected behavior of the underlying framework how the framework is supposed to work. Currently i have some questions related to new fetching approach introduced in SDG 1.1. As opposite to SDG 1.1 write\read through in 2.0 only relations and related object annotated with #Fetch annotation are eagerly fetched others are supposed to be fetched lazily .. and now my first question:
Is it possible to configure SDG so that if the loading of entity and
invoking getter on lazy relation takes place in the same transaction,
requested collection is fetch automatically? Kind of Persistence
Context in transaction scope, or maybe it is planned for the feature
releases.
How can I fetch lazy collection at once for #RelatedTo annotation ? fetch() method on from Neo4jOperation allows to fetch only one entity. Do i have to iterate through whole list and fetch entity for each object? What would be the best way to check if given object is already fetched / initialized or not?
As suggestion i think it would be more intuitive if there will be kind of lazy loading exception thrown instead of getting NPE when working with not initialized objects. Moreover the behavior is misleading since when object is not initialized and all member properties are null apart from id, equals method can provide true for different objects which has not been initialized, which is quite serious issues considering for example appliance of sets
Another issue which i noticed when working with SDG 2.0.0.RC1 is following: when i add new object to not fetched collection sometimes is properly added and persisted,however sometimes is not. I wrote test for this case and it works in non deterministic way. Sometimes it fails sometimes end with success. Here is the use case:
Group groupFromDb = neoTemplate.findOne(group.getId(), Group.class);
assertNotNull(groupFromDb);
assertEquals("Number of members must be equals to 1", 1, groupFromDb.getMembers().size());
User secondMember = UserMappingTest.createUser("secondMember");
groupFromDb.addMember(secondMember);
neoTemplate.save(groupFromDb);
Group groupAfterChange = neoTemplate.findOne(groupFromDb.getId(), Group.class);
assertNotNull(groupAfterChange);
assertEquals("Number of members must be equals to saved entity", groupFromDb.getMembers().size(), groupAfterChange.getMembers().size());
assertEquals("Number of members must be equals to 2", 2, groupAfterChange.getMembers().size());
This test fails sometimes on the last assert, which would mean that sometimes member is added to the set and sometimes not. I guess that the problem lies somewhere in the ManagedFieldAccessorSet, but it is difficult to say since this is non deterministic. I run the test with mvn2 and mvn3 with java 1.6_22 and 1.6_27 and i got always the same result: sometimes is Ok sometimes test fails. Implementation of User equals seems as follows:
#Override
public boolean equals(final Object other) {
if ( !(other instanceof User) ) {
return false;
}
User castOther = (User) other;
if(castOther.getId() == this.getId()) {
return true;
}
return new EqualsBuilder().append(username, castOther.username).isEquals();
}
- I find it also a bit problematic that for objects annotated with #Fetch java HashSet is used which is serializable, while using for lazy loaded fields ManagedFieldAccessorSet is used which is not serializable and causes not serializable exception.
Any help or advice are welcome. Thanks in advance!
I put together a quick code sample showing how to use the fetch() technique Michael describes:
http://springinpractice.com/2011/12/28/initializing-lazy-loaded-collections-with-spring-data-neo4j/
The simple mapping approach was only added to Spring Data Neo4j 2.0, so it is not as mature as the advanced AspectJ mapping. We're currently working on documenting it more extensively.
The lazy loading option was also added lately. So your feedback is very welcome.
Right now SDN doesn't employ a proxy approach for the lazily loaded objects. So the automatic "fetch on access" is not (yet) supported. That's why also no exception is thrown when accessing non-loaded fields and there is no means of "discovering" if an entity was not fully loaded.
In the current snapshot there is the template.fetch() operation to fully load lazy loaded objects and collections.
We'll look into the HashSet vs. ManagedSet issue, it is correct that this is not a good solution.
For the test-case. is the getId() returning a Long object or a long primitive? It might be sensible to use getId().equals(castOther.getId()) here as reference equality is not guaranteed for Number objects.

DBContext, state and original values

I'm working with ASP.NET MVC3 using EF and Code First.
I'm writing a simple issue tracker for practice. In my Controller I have a fairly standard bit of code:
[HttpPost]
public ActionResult Edit(Issue issue) {
if (ModelState.IsValid) {
dbContext.Entry(issue).State = EntityState.Modified
.....
}
}
Question part 1
I'm trying to get my head around how the dbcontext works -
Before I've set the State on the dbContext.Entry(issue), I assume my issue object is detached. Once I set the state to be modified, the object is attached - but to what? the dbContext or the database? I'm kind of missing what this (attaching) actually means?
Question part 2
For argument's sake, let's say I decide to set the "Accepted" field on my issue. Accepted is a boolean. I start with it being false, I'm setting it to true in the form and submitting. At the point that my object is attached, what is the point of the OriginalValues collection? for example if I set a breakpoint just after setting EntityState.Modified but before I call SaveChanges() I can query
db.Entry(issue).OriginalValues["Accepted"]
and this will give me the same value as simply querying the issue object that has been passed in to the Edit....i.e. it is giving the same result as
issue.Accepted
I'm clearly missing something because the documentation says
"The original values are usually the entity's property values as they were when last queried from the database."
But this is not the case because the database is still reporting Accepted as being false (yeah, I noted the word "usually" in the docs but my code is all pretty much standard generated by MS code so....).
So, what am I missing? what is actually going on here?
The context can work only with attached entities. The attaching means that context know about the entity, it can persists its data and in some cases it can provide advanced features like change tracking or lazy loading.
By default all entities loaded from the database by the context instance are attached to that instance. In case of web applications and other disconnected scenarios you have a new context instance for every processed HTTP request (if you don't you are doing a big mistake). Also your entity created by model binder in HTTP POST is not loaded by that context - it is detached. If you want to persist that entity you must attach it and inform context about changes you did. Setting state to Entry to Modified will do both operations - it will attach entity to the context and set its global state to Modified which means that all scalar and complex properties will be updated when you call SaveChanges.
So by setting state to Modified you have attached the entity to the context but until you call SaveChanges it will not affect your database.
OriginalValues are mostly useful in fully attached scenarios where you load entity from the database and make changes to that attached entity. In such case OriginalValues show property values loaded from database and CurrentValues show actual values set by your application. In your scenario context doesn't know original values. It thinks that original values are those used when you attached the entity.

Attaching an object to another context in EF 4

I have a baseclass in my website with a property: CurrentUser.
The get method of this property will create a new context and get a User object from the database based on auth cookie information. So far so good.
But since the context is closed, all I can do outside this, is to call properties directly under User, for example FirstName.
But as soon as I try to get a relation for example, like CurrentUser.UserOffices this won't work since I didn't include UserOffices in the query.
Is there a way to create a new context outside the baseclass which I can attach the CurrentUser object to? I have tried ctx.Attach(CurrentUser) with no luck.
You may wonder why I don't include UserOffices. This is simply because there are very many relations to different tables and I don't want to include them all since it differs between my web pages what relations are needed.
Any ideas?
You can try to Attach your entity and then explicitly load property:
ctx.Attach(CurrentUser);
ctx.LoadProperty(CurrentUser, u => u.UserOffices);
I'm not sure if this works with POCOs.
You can also query for object again with Includes specifing navigation properties you need.
The other choice is simply load UserOffices with Linq-to-entities query restricting where condition to current user.

Mimicking SQL Insert Trigger with LINQ-to-SQL

Using LINQ-to-SQL, I would like to automatically create child records when inserting the parent entity. Basically, mimicking how an SQL Insert trigger would work, but in-code so that some additional processing can be done.
The parent has an association to the child, but it seems that I cannot simply add new child records during the DataContext's SubmitChanges().
For example,
public partial class Parent
{
partial void OnValidate(System.Data.Linq.ChangeAction action)
{
if(action == System.Data.Linq.ChangeAction.Insert)
{
Child c = new Child();
... set properties ...
this.Childs.Add(c);
}
}
}
This would be ideal, but unfortunately the newly created Child record is not inserted to the database. Makes sense, since the DataContext has a list of objects/statements and probably doesn't like new items being added in the middle of it.
Similarly, intercepting the partial void InsertParent(Parent instance) function in the DataContext and attempting to add the Child record yields the same result - no errors, but nothing added to the database.
Is there any way to get this sort of behaviour without adding code to the presentation layer?
Update:
Both the OnValidate() and InsertParent() functions are called from the DataContext's SubmitChanges() function. I suspect this is the inherent difficulty with what I'm trying to do - the DataContext will not allow additional objects to be inserted (e.g. through InsertOnSubmit()) while it is in the process of submitting the existing changes to the database.
Ideally I would like to keep everything under one Transaction so that, if any errors occur during the insert/update, nothing is actually changed in the database. Hence my attempts to mimic the SQL Trigger functionality, allowing the child records to be automatically inserted through a single call to the DataContext's SubmitChanges() function.
If you want it to happen just before it is saved; you can override SubmitChanges, and call GetChangeSet() to get the pending changes. Look for the things you are interested in (for example, delta.Inserts.OfType<Customer>(), and make your required changes.
Then call base.SubmitChanges(...).
Here's a related example, handling deletes.
The Add method only sets up a link between the two objects: it doesn't mark the added item for insertion into the database. For that, you need call InsertOnSubmit on the Table<Child> instance contained within your DataContext. The trouble, of course, is that there's no innate way to access your DataContext from the method you describe.
You do have access to it by implementing InsertParent in your DataContext, so I'd go that route (and use InsertOnSubmit instead of Add, of course).
EDITED I assumed that the partial method InsertParent would be called by the DataContext at some point, but in looking at my own code that method appears to be defined but never referenced by the generated class. So what's the use, I wonder?
In linq to sql you make a "trigger" by making a partial class to the dbml file, and then inserting a partial method. Here is an example that wouldn't do anything because it calls the build-in deletion.
partial void DeleteMyTable(MyTable instance)
{
//custom code here
ExecuteDynamicDelete(instance);
//or here :-)
}

Resources