NHibernate second level cache: Query cache doesn't work as expected - caching

The packages I use:
NHibernate 5.2.1
NHibernate.Caches.SysCache 5.5.1
The NH cache config:
<configuration>
<configSections>
<section name="syscache" type="NHibernate.Caches.SysCache.SysCacheSectionHandler,NHibernate.Caches.SysCache" />
</configSections>
<syscache>
<!-- 3600 s = 1 h; priority 3 == normal cost of expiration -->
<cache region="GeoLocation" expiration="3600" sliding="true" priority="3" />
</syscache>
</configuration>
I want to query a bunch of locations using their unique primary keys. In this unit test I simulate two requests using different sessions but the same session factory:
[TestMethod]
public void UnitTest()
{
var sessionProvider = GetSessionProvider();
using (var session = sessionProvider.GetSession())
{
var locations = session
.QueryOver<GeoLocation>().Where(x => x.LocationId.IsIn(new[] {147643, 39020, 172262}))
.Cacheable()
.CacheRegion("GeoLocation")
.List();
Assert.AreEqual(3, locations.Count);
}
Thread.Sleep(1000);
using (var session = sessionProvider.GetSession())
{
var locations = session
.QueryOver<GeoLocation>().Where(x => x.LocationId.IsIn(new[] { 39020, 172262 }))
.Cacheable()
.CacheRegion("GeoLocation")
.List();
Assert.AreEqual(2, locations.Count);
}
}
If the exact same IDs are queried in the exact same order, the second call fetches the objects from the cache. In this example, however, the second query asks for only two of the previously submitted IDs. Although those locations have already been cached, the second query fetches them from the DB.
I expected the cache to work like a table that is queried first, so that only the IDs that have not been cached yet trigger a DB call. Apparently, though, the whole query is used as the hash key for the cached objects.
Is there any way to change that behavior?

There is no notion of a partial query cache; it's all or nothing: if the results for this exact query are found, they are used, otherwise the database is queried. This is because the query cache has no specific knowledge of what the queries mean (e.g. it cannot infer that the result of a particular query is a subset of some cached result).
In other words, the query cache in NHibernate acts as a document store rather than a relational table store. The key for the document is a combination of the query's SQL (in the case of LINQ, some textual representation of the expression tree), all parameter types, and all parameter values.
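To make the "all or nothing" behaviour concrete, here is a minimal sketch of a cache keyed on the query text plus its parameter values. ToyQueryCache and everything in it are hypothetical names made up for illustration; this is not NHibernate's implementation, and it ignores parameter types for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a query cache; NOT NHibernate's actual code.
public sealed class ToyQueryCache
{
    private readonly Dictionary<string, List<int>> _entries =
        new Dictionary<string, List<int>>();

    // Counts how often the "database" delegate had to be called.
    public int DatabaseHits { get; private set; }

    // The key combines the query text with every parameter value, in order.
    private static string BuildKey(string queryText, IEnumerable<int> parameters)
    {
        return queryText + "|" + string.Join(",", parameters);
    }

    public List<int> Run(string queryText, int[] ids, Func<int[], List<int>> loadFromDatabase)
    {
        var key = BuildKey(queryText, ids);
        List<int> cached;
        if (_entries.TryGetValue(key, out cached))
            return cached;              // same text and same parameters: cache hit

        DatabaseHits++;                 // any other key is a miss, even for a subset of ids
        var result = loadFromDatabase(ids);
        _entries[key] = result;
        return result;
    }
}
```

Running the same query text with { 147643, 39020, 172262 } twice reaches the delegate once, but a third call with { 39020, 172262 } builds a different key and goes to the delegate again, which mirrors what the unit test in the question observes.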
To solve your particular case I would suggest doing some performance testing. Depending on the results and the dataset size there are several possible solutions: filter the cached results on the client (something like the following), do not use the query cache, or implement a caching mechanism for this particular query at the application level.
[TestMethod]
public void UnitTest()
{
var sessionProvider = GetSessionProvider();
using (var session = sessionProvider.GetSession())
{
var locations = session
.QueryOver<GeoLocation>()
.Cacheable()
.CacheRegion("GeoLocation")
.List()
.Where(x => new[] {147643, 39020, 172262}.Contains(x.LocationId))
.ToList();
Assert.AreEqual(3, locations.Count);
}
Thread.Sleep(1000);
using (var session = sessionProvider.GetSession())
{
var locations = session
.QueryOver<GeoLocation>()
.Cacheable()
.CacheRegion("GeoLocation")
.List()
.Where(x => new[] {39020, 172262}.Contains(x.LocationId))
.ToList();
Assert.AreEqual(2, locations.Count);
}
}
More information on how the (N)Hibernate query cache works can be found here.

Related

Recursive linq expressions to get non NULL parent value?

I wrote a simple recursive function to climb up the tree of a table that has ID and PARENTID.
But when I do that I get this error:
System.InvalidOperationException: 'The instance of entity type 'InternalOrg' cannot be tracked because another instance with the same key value for {'Id'} is already being tracked. When attaching existing entities, ensure that only one entity instance with a given key value is attached.'
Is there another way to do this, or could it maybe be done in one LINQ expression?
private InternalOrgDto GetInternalOrgDto(DepartmentChildDto dcDto)
{
if (dcDto.InternalOrgId != null)
{
InternalOrg io = _internalOrgRepo.Get(Convert.ToInt32(dcDto.InternalOrgId));
InternalOrgDto ioDto = new InternalOrgDto
{
Id = io.Id,
Abbreviation = io.Abbreviation,
Code = io.Code,
Description = io.Description
};
return ioDto;
}
else
{
//manually get parent department
Department parentDepartment = _departmentRepo.Get(Convert.ToInt32(dcDto.ParentDepartmentId));
DepartmentChildDto parentDepartmenDto = ObjectMapper.Map<DepartmentChildDto>(parentDepartment);
return GetInternalOrgDto(parentDepartmenDto);
}
}
Is there a way to get a top-level parent from a given child via Linq? Not that I am aware of. You can do it recursively similar to what you have done, though I would recommend simplifying the query to avoid loading entire entities until you get what you want. I'm guessing from your code that only top level parent departments would have an InternalOrg? Otherwise this method would recurse up the parents until it found one. This could be sped up a bit like:
private InternalOrgDto GetInternalOrgDto(DepartmentChildDto dcDto)
{
var internalOrgId = dcDto.InternalOrgId
?? FindInternalOrgid(dcDto.ParentDepartmentId)
?? throw new InternalOrgNotFoundException();
InternalOrgDto ioDto = _context.InternalOrganizations
.Where(x => x.InternalOrgId == internalOrgId.Value)
.Select(x => new InternalOrgDto
{
Id = x.Id,
Abbreviation = x.Abbreviation,
Code = x.Code,
Description = x.Description
}).Single();
return ioDto;
}
private int? FindInternalOrgid(int? departmentId)
{
if (!departmentId.HasValue)
return (int?) null;
var details = _context.Departments
.Where(x => x.DepartmentId == departmentId.Value)
.Select(x => new
{
x.InternalOrgId,
x.ParentDepartmentId
}).Single();
if (details.InternalOrgId.HasValue)
return details.InternalOrgId;
return FindInternalOrgid(details.ParentDepartmentId);
}
The key consideration here is to avoid repository methods that return entities or sets of entities, especially where you don't need everything about an entity. By leveraging the IQueryable provided by EF through Linq we can project down to just the data we need rather than returning every field. The database server can accommodate this better via indexing, and it helps avoid things like locks. If you are using repositories to enforce low-level domain rules or to enable unit testing, then the repositories can expose IQueryable<TEntity> rather than IEnumerable<TEntity> or even TEntity to enable projection and other EF Linq goodness.
Another option to consider, where you have hierarchical data in which the relationships are important and you want to quickly find all entities related to a parent or get to a specific level, is to store a breadcrumb with each record that is updated whenever that item is moved. The benefit is that these kinds of checks become trivial; the risk is that anything that can modify data relationships could leave the breadcrumb trail in an invalid state.
For example, if I have Department 22, which belongs to Department 8, which belongs to Department 2, a top-level department, then 22's breadcrumb trail would be "2,8". If the breadcrumbs are empty we have a top-level entity (with no parent Id). We can parse the breadcrumbs using a simple string.Split() operation. This avoids the recursive trips to the DB entirely, though you may want a maintenance job running behind the scenes to periodically inspect recently modified data, verify that the breadcrumb trails are accurate, and alert you if any get broken (by faulty code or otherwise).
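As a rough sketch of the breadcrumb idea, assuming the trail is stored as a comma-separated string of ancestor ids with the top-level ancestor first (as in the "2,8" example), the parsing could look like this. The Breadcrumbs class and its method names are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical helper; assumes trails such as "2,8" with the top-level ancestor first.
public static class Breadcrumbs
{
    // Returns the ancestor ids in trail order; an empty trail yields an empty array.
    public static int[] ParseAncestors(string breadcrumb)
    {
        if (string.IsNullOrWhiteSpace(breadcrumb))
            return new int[0];

        return breadcrumb
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(part => int.Parse(part.Trim()))
            .ToArray();
    }

    // The top-level id is the first ancestor, or the item's own id when the trail is empty.
    public static int TopLevelId(int ownId, string breadcrumb)
    {
        var ancestors = ParseAncestors(breadcrumb);
        return ancestors.Length == 0 ? ownId : ancestors[0];
    }
}
```

With this, finding the top-level department for Department 22 is a single string operation on its own row, and only one further query is needed to load whatever hangs off that top-level department.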

WebApi reordering results?

I have a problem with WebApi and OData. I am slowly moving an API over and now it seems that the framework is reordering the results.
The following code:
[EnableQuery(PageSize = 100, MaxTop = 1000, AllowedQueryOptions = AllowedQueryOptions.All)]
[ODataRoute]
public IEnumerable<Reflexo.Api.GrdJob> Get(ODataQueryOptions options) {
var nodes = Repository.GrdJob
.Include(x=>x.Cluster)
.OrderByDescending(x => x.Id)
.Select(x => new Reflexo.Api.GrdJob() {
Id = x.Id,
Identity = x.Code,
}).AsQueryable();
nodes = (IQueryable<Reflexo.Api.GrdJob>)options.ApplyTo(nodes);
var retval = nodes.ToArray();
return nodes;
}
is as simple as it gets. Comparing the results in the debugger with what I can see on the screen when calling the method, the results have a different order.
Note that I am comparing the db-side id fields (id) of both the JSON I see in a browser and the fields in the array named retval. I have imposed an artificial default order, which also gets into the SQL (checked) and the array (checked).
Just the JSON shows results in a different order.
Am I missing something?
Be aware the EnableQueryAttribute is going to execute ODataQueryOptions.ApplyTo again using a default set of query settings. (See the EnableQueryAttribute.ApplyQuery method source.) Try removing the attribute.
Thanks to the other contributors on this post. I ran into the same issue described above, and managed to disable OData's default sorting by adding the EnsureStableOrdering = false to my [EnableQuery()] attribute.

EclipseLink Cache not getting cached

@NamedQuery(
name=DBConstants.SERVICE_BY_ID, query="select s from Service s where s.id= :id",
hints={
@QueryHint(name = QueryHints.QUERY_RESULTS_CACHE, value = HintValues.TRUE),
@QueryHint(name = QueryHints.QUERY_RESULTS_CACHE_SIZE, value = "500")
})
I am using different entity manager on every query request.
Query query = em.createNamedQuery(DBConstants.SERVICE_BY_ID);
query.setParameter("id", id);
result = (Service) query.getSingleResult();
This always returns data from the database, never from the cache.
How I tested: I changed a column value in the database, and when I ran the query again the modified value came back. Ideally it should return the stale data, since the cache has been neither refreshed nor invalidated.
Revision 2:
If we try to cache named queries, then using multitenant-aware persistence classes causes a problem. How? Here is the way I think it works:
@Cache(
type=CacheType.SOFT,
size=6000,
expiry=600000
)
@Multitenant
@TenantDiscriminatorColumn(...)
@NamedQueries({
@NamedQuery(
name=DBConstants.ALL_SERVICES, query="select s from Service s",
hints={
@QueryHint(name = QueryHints.QUERY_RESULTS_CACHE, value = HintValues.TRUE),
@QueryHint(name = QueryHints.QUERY_RESULTS_CACHE_SIZE, value = "10")
})
})
@IdClass(ServicePK.class)
public class Service implements Serializable { ... }
Creation of EMF:
String tenantID = getTenantID();
properties.put(DBConstants.TENANT_ID_CONTEXT_PROPERTY, tenantID);
properties.put(DBConstants.LINK_SESSION_NAME, "session-" + tenantID);
if(emf == null)
emf = Persistence.createEntityManagerFactory("<PERSISTENCE_UNIT_NAME>", properties);
Creation of EM:
em = emf.createEntityManager();
In this case, our named query "ALL_SERVICES" is to search for all services and we want named query cache to cache based on that query. But when we execute this query, EM appends the tenantId to the query i.e.
select s from Service s where s.tenantID = <VALUE>
This is cached in the isolated L1 cache and is available as long as you use the same EM instance.
When you create a new EM instance, it checks whether the entry is present in the L2 cache before running the query against the DB. L2 caches based only on the query we specified, not on the query with the tenantId appended. This causes a cache miss, and the call always goes to the backend.
Now, how do I solve this problem? I can remove the multitenant annotations and add the tenantId to the query myself. I tested it and it works absolutely fine.
But removing the multitenant annotations means a lot of condition checks have to be done manually, especially when we have dependencies on other persistence classes.
I need a better solution. Can anyone help?

How to perform deletion of entities based on a list of ids

I am just starting with linq and entity framework in general and I have a question that may seem naive to all of the advanced users!
I have the following code :
var allDocuments = (from i in companyData.IssuedDocuments select i.IssuedDocumentId).ToList<int>();
var deletedDocuments = allDocuments.Except(updatedDocuments);
and I need to delete all the entities in companyData whose id is stored in deletedDocuments, in a disconnected scenario.
Could you please show me a way to do this in an efficient manner?
You could avoid fetching all the ids by specifying you only want deleted ids like this:
var deletedIds = from i in companyData.IssuedDocuments
where !updatedIds.Contains(i.IssuedDocumentId)
select i.IssuedDocumentId;
Now if companyData.IssuedDocuments is a DbSet you can tell EF to delete them like this:
foreach (var id in deletedIds)
{
var entity = new MyEntity { Id = id };
companyData.IssuedDocuments.Attach(entity);
companyData.IssuedDocuments.Remove(entity);
}
dbContext.SaveChanges();
This will issue multiple DELETE statements to the database without fetching the full entities into memory.
If companyData.IssuedDocuments is your repository then you could load the full entities instead of just the ids:
var deleted = from i in companyData.IssuedDocuments
where !updatedIds.Contains(i.IssuedDocumentId)
select i;
foreach (var entity in deleted)
companyData.IssuedDocuments.Delete(entity);
dbContext.SaveChanges();
Again, EF issues multiple DELETE statements to the database.
If you can upgrade, EF6 introduced a RemoveRange method on DbSet that you could look at. It runs change detection only once for the whole set, but as far as I know it still sends one DELETE statement per entity.
If performance is still an issue then you have to execute SQL directly.
References:
RemoveRange
Deleting an object without retrieving it
How should I remove all elements in a DbSet?
companyData.RemoveAll(x=>deletedDocuments.Contains(x.Id));
I suppose companyData is a List<T>, where the type T has an Id property holding the id of the data. deletedDocuments then contains the ids of all the documents that we want to remove.
One important thing to note is that this deletion happens in memory; it is not executed against the database. If you need the latter, please tell us which version of Entity Framework you use and how you implement your CRUD operations against your database.
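A minimal in-memory sketch of that RemoveAll call, assuming a plain List<T> of documents. The Doc class and the InMemoryDelete helper are hypothetical stand-ins; nothing here touches Entity Framework or a database.

```csharp
using System.Collections.Generic;

// Hypothetical document type with just the id this example needs.
public class Doc
{
    public int Id { get; set; }
}

public static class InMemoryDelete
{
    // Removes every document whose id is in deletedIds and returns how many were removed.
    public static int RemoveDeleted(List<Doc> documents, IEnumerable<int> deletedIds)
    {
        // A HashSet keeps each Contains check O(1) instead of rescanning the id list.
        var lookup = new HashSet<int>(deletedIds);
        return documents.RemoveAll(d => lookup.Contains(d.Id));
    }
}
```

The list is modified in place, so any persistence of the change still has to be done separately.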
Firstly I would like to thank you all for your suggestions.
I followed Christos Paisios's suggestion, but I was getting all kinds of exceptions when I tried to persist the changes to the DB. I finally managed to solve the issues by adding the following override to my DbContext class:
public override int SaveChanges()
{
var orphanedResponses = ChangeTracker.Entries().Where(
e => (e.State == EntityState.Modified || e.State == EntityState.Added) &&
e.Entity is IssuedDocument &&
e.Reference("CompanyData").CurrentValue == null);
foreach (var orphanedResponse in orphanedResponses)
{
IssuedDocuments.Remove(orphanedResponse.Entity as IssuedDocument);
}
return base.SaveChanges();
}

MVC3 Entity Framework Code First Updating Subset Related List of Items

I have a table of data with a list of key value pairs in it.
Key          Value
-----------  ----------------------------
ElementName  PrimaryEmail
Email        someemail@gmail.ca
Value        Content/Images/logo-here.jpg
I am able to generate new items on my client webpage. When I create a new row on the client and save it to the server by executing the following code, the item is saved to the database as expected.
public JsonResult Add(CardElement cardElement)
{
db.Entry(cardElement).State = EntityState.Added;
db.SaveChanges();
return Json(cardElement);
}
Now, when I want to delete my objects by sending another ajax request I get a failure.
public void Delete(CardElement[] cardElements)
{
foreach (var cardElement in cardElements)
{
db.Entry(cardElement).State = EntityState.Deleted;
}
db.SaveChanges();
}
This results in the following error.
Store update, insert, or delete statement affected an unexpected number of rows (0). Entities may have been modified or deleted since entities were loaded. Refresh ObjectStateManager entries.
I have tried other ways of deleting, including find-by-id-and-remove and attach-and-delete, but obviously I am not approaching this in the right fashion.
I am not sure what is causing your issue, but I tend to structure my deletes as follows:
public void Delete(CardElement[] cardElements)
{
foreach (var cardElement in cardElements)
{
var element = db.Table.Where(x => x.ID == cardElement.ID).FirstOrDefault();
if(element != null)
db.DeleteObject(element);
}
db.SaveChanges();
}
although I tend to do database first development, which may change things slightly.
EDIT: the error you are receiving states that no rows were updated. When you pass an object to a view, then pass it back to the controller, this tends to break the link between the object and the data store. That is why I prefer to look up the object first based on its ID, so that I have an object that is still linked to the data store.

Resources