Recursive linq expressions to get non NULL parent value? - linq

I wrote a simple recursive function to climb up the tree of a table that has ID and PARENTID.
But when I do that I get this error:
System.InvalidOperationException: 'The instance of entity type 'InternalOrg' cannot be tracked because another instance with the same key value for {'Id'} is already being tracked. When attaching existing entities, ensure that only one entity instance with a given key value is attached.
Is there another way to do this, or could it be done in a single LINQ expression?
private InternalOrgDto GetInternalOrgDto(DepartmentChildDto dcDto)
{
if (dcDto.InternalOrgId != null)
{
InternalOrg io = _internalOrgRepo.Get(Convert.ToInt32(dcDto.InternalOrgId));
InternalOrgDto ioDto = new InternalOrgDto
{
Id = io.Id,
Abbreviation = io.Abbreviation,
Code = io.Code,
Description = io.Description
};
return ioDto;
}
else
{
//manually get parent department
Department parentDepartment = _departmentRepo.Get(Convert.ToInt32(dcDto.ParentDepartmentId));
DepartmentChildDto parentDepartmenDto = ObjectMapper.Map<DepartmentChildDto>(parentDepartment);
return GetInternalOrgDto(parentDepartmenDto);
}
}

Is there a way to get a top-level parent from a given child in a single LINQ expression? Not that I am aware of. You can do it recursively, much as you have done, though I would recommend simplifying the queries so that you don't load entire entities before you know which one you need. I'm guessing from your code that only some departments (perhaps only the top-level ones) have an InternalOrg, and that the method should climb the parents until it finds one. That could be sped up a bit like this:
private InternalOrgDto GetInternalOrgDto(DepartmentChildDto dcDto)
{
    // The chain either yields a value or throws, so internalOrgId is a plain int here
    var internalOrgId = dcDto.InternalOrgId
        ?? FindInternalOrgId(dcDto.ParentDepartmentId)
        ?? throw new InternalOrgNotFoundException();
    // Project straight into the DTO; no InternalOrg entity is loaded or tracked
    InternalOrgDto ioDto = _context.InternalOrganizations
        .Where(x => x.Id == internalOrgId)
        .Select(x => new InternalOrgDto
        {
            Id = x.Id,
            Abbreviation = x.Abbreviation,
            Code = x.Code,
            Description = x.Description
        }).Single();
    return ioDto;
}
private int? FindInternalOrgId(int? departmentId)
{
    if (!departmentId.HasValue)
        return null;
    // Read only the two columns needed to decide whether to keep climbing
    var details = _context.Departments
        .Where(x => x.DepartmentId == departmentId.Value)
        .Select(x => new
        {
            x.InternalOrgId,
            x.ParentDepartmentId
        }).Single();
    if (details.InternalOrgId.HasValue)
        return details.InternalOrgId;
    return FindInternalOrgId(details.ParentDepartmentId);
}
The key consideration here is to avoid repository methods that return entities or sets of entities, especially when you don't need everything about an entity. By working with the IQueryable that EF provides, we can project down to just the data we need rather than returning every column. Projections into DTOs or anonymous types are not tracked by the context, so they also sidestep the "already being tracked" exception. The database server can serve such narrow queries better through indexes, which helps avoid things like locks. If you use repositories to enforce low-level domain rules or to enable unit testing, they can expose IQueryable<TEntity> rather than IEnumerable<TEntity> or a single TEntity, which keeps projection and the rest of EF's LINQ support available.
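As a rough sketch of that last point, a repository can hand out IQueryable<T> and let the caller project. The Department shape, DepartmentRepository and InternalOrgLookup below are hypothetical names made up for illustration, and the lookup is written as a loop with a cycle guard rather than as recursion. With EF the constructor argument would be the DbSet; here any IQueryable works, so the sketch can run against an in-memory list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity; only the columns used in the answer above.
public class Department
{
    public int DepartmentId { get; set; }
    public int? ParentDepartmentId { get; set; }
    public int? InternalOrgId { get; set; }
}

// Hypothetical repository: it exposes IQueryable<T> so callers can filter and project.
public class DepartmentRepository
{
    private readonly IQueryable<Department> _source;

    // With EF this would be the DbSet; any IQueryable works for the sketch.
    public DepartmentRepository(IQueryable<Department> source) { _source = source; }

    public IQueryable<Department> GetAll() { return _source; }
}

public static class InternalOrgLookup
{
    // Climbs the parent chain, reading only two columns per step.
    public static int? FindInternalOrgId(DepartmentRepository repo, int? departmentId)
    {
        var visited = new HashSet<int>(); // stops the loop if the parent chain is cyclic
        while (departmentId.HasValue && visited.Add(departmentId.Value))
        {
            int id = departmentId.Value;
            var details = repo.GetAll()
                .Where(d => d.DepartmentId == id)
                .Select(d => new { d.InternalOrgId, d.ParentDepartmentId })
                .SingleOrDefault();
            if (details == null) return null;            // unknown department id
            if (details.InternalOrgId.HasValue) return details.InternalOrgId;
            departmentId = details.ParentDepartmentId;   // keep climbing
        }
        return null;
    }
}
```

Calling InternalOrgLookup.FindInternalOrgId(repo, someDepartmentId) returns the first InternalOrgId found on the way up, or null if the chain ends without one.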
Another option for hierarchical data, where the relationships matter and I want to quickly find everything related to a parent or get to a specific level, is to store a breadcrumb with each record and update it whenever that item is moved. The benefit is that these kinds of checks become trivial; the risk is that any code path that modifies the relationships can leave the breadcrumb trail in an invalid state.
For example, if Department 22 belongs to Department 8, which belongs to Department 2, a top-level department, then 22's breadcrumb trail would be "2,8". If the breadcrumbs are empty (and there is no parent ID), we have a top-level entity. We can parse the breadcrumbs with a simple string.Split() call, which avoids the recursive trips to the DB entirely. You may still want a maintenance job running behind the scenes that periodically inspects recently modified data, checks that the breadcrumb trails are accurate, and alerts you if any are broken (for example by faulty code).
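The breadcrumb idea could look roughly like the sketch below. The Breadcrumbs class and its method names are hypothetical, and the trail format (comma-separated ancestor ids, top-level first, empty for a top-level record) is taken from the example above.

```csharp
using System;
using System.Linq;

public static class Breadcrumbs
{
    // Returns the ancestor ids, top-level first; a null or empty trail means no ancestors.
    public static int[] Parse(string trail)
    {
        if (string.IsNullOrWhiteSpace(trail)) return new int[0];
        return trail.Split(',').Select(part => int.Parse(part.Trim())).ToArray();
    }

    // The top-level ancestor is the first entry; a top-level record is its own root.
    public static int TopLevelId(int ownId, string trail)
    {
        int[] ancestors = Parse(trail);
        return ancestors.Length == 0 ? ownId : ancestors[0];
    }

    // Trail to store on a child of the given parent: the parent's trail plus the parent's id.
    public static string ForChildOf(int parentId, string parentTrail)
    {
        return string.IsNullOrWhiteSpace(parentTrail)
            ? parentId.ToString()
            : parentTrail + "," + parentId;
    }
}
```

With this shape, moving a department means recomputing ForChildOf for it and for every descendant, which is exactly the maintenance cost mentioned above.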

Related

Better way to handle this code

I am working on an MVC3 application with NHibernate and SQL Server. I have written an ordinary, reusable method. Please see the code below and let me know a better way to handle it. I have observed that this piece of code takes a long time to execute.
private void GetParentCompany(IEnumerable<Company> companiesList)
{
foreach (var company in companiesList)
{
long? dunsUltimateParent = company.DUNSUltimateParent;
Company ultimateParent = _companyService.GetCompanyByDUNS(Convert.ToInt64(dunsUltimateParent));
if (ultimateParent != null)
{
company.UltimateParentName = ultimateParent.CompanyName;
company.UltimateCompanyId = ultimateParent.CompanyId;
company.UltimateParentDuns = ultimateParent.DUNS;
}
}
}
Adding an index to the DUNS column of your company table might help. However, consider introducing a many-to-one relationship from company to (parent) company.
Add an UltimateParent property of type Company to the Company class. The fields UltimateParentName and UltimateParentDuns would then be redundant, and you could simply read company.UltimateParent.CompanyName, for example. The mapping of UltimateParent can be done with 'References' in Fluent NHibernate:
References(x => x.UltimateParent);

How to perform deletion of entities based on a list of ids

I am just starting with LINQ and Entity Framework in general, and I have a question that may seem naive to more advanced users!
I have the following code :
var allDocuments = (from i in companyData.IssuedDocuments select i.IssuedDocumentId).ToList<int>();
var deletedDocuments = allDocuments.Except(updatedDocuments);
and I need to delete all the entities in companyData whose ids are stored in deletedDocuments, in a disconnected scenario.
Could you please show me a way to do this in an efficient manner?
You could avoid fetching all the ids by specifying you only want deleted ids like this:
var deletedIds = from i in companyData.IssuedDocuments
                 where !updatedIds.Contains(i.IssuedDocumentId)
                 select i.IssuedDocumentId;
Now if companyData.IssuedDocuments is a DbSet you can tell EF to delete them like this:
foreach (var id in deletedIds)
{
    // Attach a stub that carries only the key, then mark it as deleted
    var entity = new IssuedDocument { IssuedDocumentId = id };
    companyData.IssuedDocuments.Attach(entity);
    companyData.IssuedDocuments.Remove(entity);
}
dbContext.SaveChanges();
This will issue multiple DELETE statements to the database without fetching the full entities into memory.
If companyData.IssuedDocuments is your repository then you could load the full entities instead of just the ids:
var deleted = from i in companyData.IssuedDocuments
              where !updatedIds.Contains(i.IssuedDocumentId)
              select i;
foreach (var entity in deleted)
    companyData.IssuedDocuments.Delete(entity);
dbContext.SaveChanges();
Again, EF issues multiple DELETE statements to the database.
If you can upgrade, EF6 introduced a RemoveRange method on DbSet that you could look at. I haven't tried it yet; as far as I know it still sends one DELETE per entity, but it runs change detection only once for the whole batch.
If performance is still an issue, then you have to execute SQL directly.
References:
RemoveRange
Deleting an object without retrieving it
How should I remove all elements in a DbSet?
companyData.RemoveAll(x=>deletedDocuments.Contains(x.Id));
I assume companyData is a List<T> (RemoveAll is a List<T> method), where T has an Id property holding the document's id, and that deletedDocuments contains the ids of all the documents we want to remove.
One important thing to note: this deletion happens in memory only and is not executed against the database. If you need the latter, please tell us which version of Entity Framework you use and how you implement your CRUD operations against your db.
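A minimal in-memory sketch of that RemoveAll approach follows. The IssuedDocument stand-in and the DocumentSync helper are hypothetical names; nothing here touches a database, so persisting the removals is still a separate step.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity in the question.
public class IssuedDocument
{
    public int IssuedDocumentId { get; set; }
}

public static class DocumentSync
{
    // Removes every document whose id is not in updatedIds and returns the removed ids.
    public static List<int> RemoveMissing(List<IssuedDocument> documents, IEnumerable<int> updatedIds)
    {
        var keep = new HashSet<int>(updatedIds); // constant-time membership checks
        var removedIds = documents
            .Select(d => d.IssuedDocumentId)
            .Where(id => !keep.Contains(id))
            .ToList();
        documents.RemoveAll(d => !keep.Contains(d.IssuedDocumentId));
        return removedIds;
    }
}
```

The returned id list is what you would then hand to whatever performs the actual deletes against the database.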
Firstly I would like to thank you all for your suggestions.
I followed Christos Paisios's suggestion, but I was getting all kinds of exceptions when trying to persist the changes to the DB. I finally solved the issue by adding the following override to my DbContext class:
public override int SaveChanges()
{
var orphanedResponses = ChangeTracker.Entries().Where(
e => (e.State == EntityState.Modified || e.State == EntityState.Added) &&
e.Entity is IssuedDocument &&
e.Reference("CompanyData").CurrentValue == null);
foreach (var orphanedResponse in orphanedResponses)
{
IssuedDocuments.Remove(orphanedResponse.Entity as IssuedDocument);
}
return base.SaveChanges();
}

Entity Framework Code-First: "The ObjectStateManager cannot track multiple objects with the same key."

I'm running into an issue with Entity Framework code-first in MVC3. I'm hitting this exception:
An object with the same key already exists in the ObjectStateManager.
The ObjectStateManager cannot track multiple objects with the same
key.
This is addressed many times on SO, but I'm having trouble utilizing any of the suggested solutions in my situation.
Here is a code sample:
FestORM.SaleMethod method = new FestORM.SaleMethod
{
Id = 2,
Name = "Test Sale Method"
};
FestContext context = new FestContext();
//everything works without this line:
string thisQueryWillMessThingsUp =
context.SaleMethods.Where(m => m.Id == 2).Single().Name;
context.Entry(method).State = System.Data.EntityState.Modified;
context.SaveChanges();
EDITED to clarify: I am attempting to update an object that already exists in the database.
Everything works fine without the query noted in the code. In my application, my controller is instantiating the context, and that same context is passed to several repositories that are used by the controller--so I am not able to simply use a different context for the initial query operation. I've tried to remove the entity from being tracked in the ObjectStateManager, but I can't seem to get anywhere with that either. I'm trying to figure out a solution that will work for both conditions: sometimes I will be updating an object that is tracked by the ObjectStateManager, and sometimes it will happen to have not been tracked yet.
FWIW, my real repository functions look like this, just like the code above:
public void Update(T entity)
{
//works ONLY when entity is not tracked by ObjectStateManager
_context.Entry(entity).State = System.Data.EntityState.Modified;
}
public void SaveChanges()
{
_context.SaveChanges();
}
Any ideas? I've been fighting this for too long...
The problem is that this query
string thisQueryWillMessThingsUp =
context.SaleMethods.Where(m => m.Id == 2).Single().Name;
brings one instance of the SaleMethod entity into the context and then this code
context.Entry(method).State = System.Data.EntityState.Modified;
attaches a different instance to the context. Both instances have the same primary key, so EF thinks that you are trying to attach two different entities with the same key to the context. It doesn't know that they are both supposed to be the same entity.
If for some reason you just need to query for the name, but don't want to actually bring the full entity into the context, then you can do this:
string thisQueryWillMessThingsUp =
context.SaleMethods.Where(m => m.Id == 2).AsNoTracking().Single().Name;
If what you are trying to do is update an existing entity and you have values for all mapped properties of that entity, then the simplest thing to do is to not run the query and just use:
context.Entry(method).State = System.Data.EntityState.Modified;
If you don't want to update all properties, possibly because you don't have values for all properties, then querying for the entity and setting properties on it before calling SaveChanges is an acceptable approach. There are several ways to do this depending on your exact requirements. One way is to use the Property method, something like so:
var salesMethod = context.SaleMethods.Find(2); // Basically equivalent to your query
context.Entry(salesMethod).Property(e => e.Name).CurrentValue = newName;
context.Entry(salesMethod).Property(e => e.SomeOtherProp).CurrentValue = newOtherValue;
context.SaveChanges();
These blog posts contain some additional information that might be helpful:
http://blogs.msdn.com/b/adonet/archive/2011/01/29/using-dbcontext-in-ef-feature-ctp5-part-4-add-attach-and-entity-states.aspx
http://blogs.msdn.com/b/adonet/archive/2011/01/30/using-dbcontext-in-ef-feature-ctp5-part-5-working-with-property-values.aspx
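To make the "two instances, one key" explanation concrete, here is a toy model of an identity map. It is not EF and does not use any EF API; ToyContext and its methods are invented for illustration. Its Update method shows one way to cover both situations from the question: copy values onto the instance that is already tracked, or attach the new one if nothing with that key is tracked yet. In real EF the rough counterpart would be to look up the tracked instance (for example with Find) and copy values onto it with context.Entry(tracked).CurrentValues.SetValues(entity), falling back to setting the state to Modified otherwise.

```csharp
using System;
using System.Collections.Generic;

public class SaleMethod
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Toy model of a context's identity map; an illustration only, not EF.
public class ToyContext
{
    private readonly Dictionary<int, SaleMethod> _tracked = new Dictionary<int, SaleMethod>();

    // Mimics attaching: a second, different instance with the same key is rejected.
    public void Attach(SaleMethod entity)
    {
        SaleMethod existing;
        if (_tracked.TryGetValue(entity.Id, out existing) && !ReferenceEquals(existing, entity))
            throw new InvalidOperationException(
                "Another instance with key " + entity.Id + " is already tracked.");
        _tracked[entity.Id] = entity;
    }

    // Works whether or not the key is already tracked:
    // copy onto the tracked instance if there is one, otherwise attach the new instance.
    public SaleMethod Update(SaleMethod entity)
    {
        SaleMethod existing;
        if (_tracked.TryGetValue(entity.Id, out existing))
        {
            existing.Name = entity.Name;
            return existing;
        }
        Attach(entity);
        return entity;
    }
}
```

Attach on its own reproduces the failure from the question, while Update returns whichever instance ends up being tracked.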
The obvious answer would be that you're not actually saving the method object to the database before you call:
//everything works without this line:
string thisQueryWillMessThingsUp = context.SaleMethods.Where(m => m.Id == 2).Single().Name;
However, I think perhaps this is just a bit of code you left out.
What if you make your entities inherit from an abstract class ie.
public abstract class BaseClass
{
public int Id { get; set; }
}
Then update your Repository to
public class Repository<T> where T : BaseClass
{
.....
public void Update(T entity)
{
_context.Entry(entity).State = entity.Id == 0 ? System.Data.EntityState.Added : System.Data.EntityState.Modified;
}
}
Also, you might want to leave the ID of your SaleMethod unset and let the database generate it. The problem could also be that a SaleMethod with an Id of 2 already exists in the database and you then try to add another SaleMethod object with Id 2.
The error you see stems from trying to add a second SaleMethod object with an ID of 2 to the ObjectStateManager.

Entity Framework 4 - List<T> Order By based on T's children's property

I have the following code -
public void LoadAllContacts()
{
var db = new ContextDB();
var contacts = db.LocalContacts.ToList();
grdItems.DataSource = contacts.OrderBy(x => x.Areas.OrderBy(y => y.Name));
grdItems.DataBind();
}
I'm trying to sort the list of the contacts according to the area name that is contained within each contact. When I tried the above, I get "At least one object must implement IComparable.". Is there an easy way instead of writing a custom IComparer?
Thanks!
Try this:
public void LoadAllContacts()
{
var db = new ContextDB();
var contacts = db.LocalContacts.ToList();
grdItems.DataSource = contacts.OrderBy(x => x.Areas.OrderBy(y => y.Name).First().Name);
grdItems.DataBind();
}
This will order the contacts by their first area name, after ordering each contact's areas by name.
Hope this helps :)
Edit: fixed error in code. (.First().Name)
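One caveat with .First() in the in-memory version is that it throws for a contact that has no areas. A hedged sketch of a variant that tolerates that case is below; Area, Contact and ContactSorting are hypothetical stand-ins for the generated classes, and putting contacts without areas first (empty sort key) is an assumption, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the generated entity classes.
public class Area
{
    public string Name { get; set; }
}

public class Contact
{
    public string FullName { get; set; }
    public List<Area> Areas { get; set; } = new List<Area>();
}

public static class ContactSorting
{
    // Sort key: the alphabetically first area name, or "" when the contact has no areas.
    public static string FirstAreaName(Contact contact)
    {
        return contact.Areas
            .Select(a => a.Name)
            .OrderBy(n => n, StringComparer.Ordinal)
            .FirstOrDefault() ?? string.Empty;
    }

    // Orders contacts by that key; contacts without areas come first.
    public static List<Contact> ByFirstAreaName(IEnumerable<Contact> contacts)
    {
        return contacts
            .OrderBy(c => FirstAreaName(c), StringComparer.Ordinal)
            .ToList();
    }
}
```

The result of ContactSorting.ByFirstAreaName(contacts) can be assigned to grdItems.DataSource in place of the OrderBy expression above.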
I was in a discussion with @AbdouMoumen, but in the end I thought I'd provide my own answer :-)
His answer works, but there are two performance issues in this code (both in the answer and in the original question).
First, the code loads ALL contacts in the db. This may or may not be a problem, but in general I would recommend NOT doing this. Many modern controls support paging and filtering out of the box, so you'd be better off supplying a not-yet-evaluated IQueryable<T> instead of a List<T>. If, however, you need everything in memory, you should delay the ToList call until the last possible moment.
Second, AbdouMoumen's answer has a so-called 'SELECT N+1' problem. Entity Framework uses lazy loading by default to fetch navigation properties, i.e. the Areas property will not be fetched from the database until it is accessed. In this case that happens while the sequence is being enumerated, as the ordering evaluates the sort key for each contact.
Open up SQL Server Profiler to see what I mean: you will see one SELECT statement for all the contacts, and an additional SELECT statement per contact that fetches the Areas for that contact.
A much better solution would be the following:
public void LoadAllContacts()
{
    using (var db = new ContextDB())
    {
        // note: no ToList() yet, just defining the query
        // FirstOrDefault is used because EF does not accept First() inside a query
        var contactsQuery = db.LocalContacts
            .OrderBy(x => x.Areas
                .OrderBy(y => y.Name)
                .Select(y => y.Name)
                .FirstOrDefault());
        // fetch all the contacts, correctly ordered in the DB
        grdItems.DataSource = contactsQuery.ToList();
        grdItems.DataBind();
    }
}
Is it a one-to-one relation (Contact->Area)?
If so, try the following:
public partial class Contact
{
public string AreaName
{
get
{
if (this.Area != null)
return this.Area.Name;
return string.Empty;
}
}
}
then
grdItems.DataSource = contacts.OrderBy(x => x.AreaName);

Can ExecuteQuery return a DBML generated class without having it fetch all the information for that class?

I have a couple of DBML generated classes which are linked together by an id, e.g.
ClassA {
AID,
XID,
Name
}
ClassB {
AID,
ExtraInfo,
ExtraInfo2
}
When I use something like db.ClassAs.Where(XID == x) and iterate through the result,
it ends up executing a query for each of the ClassAs and each of the ClassBs, which is slow.
Alternatively, I've tried to use ExecuteQuery to fetch all the info I care about and have it return a ClassA. When iterating over that, it does the same thing, i.e. a lot of individual fetches instead of just one. If I store the result in a ClassC (not associated with a DB entity) that has the fields of interest from both ClassA and ClassB, the query is much faster, but it's annoying because I have just created what is, in my opinion, an unnecessary ClassC.
How can I still use ClassA, which is associated with ClassB, and still use ExecuteQuery to run 1 query instead of A*B queries?
If you have associations you shouldn't need to use ExecuteQuery().
Here's an example using some imaginary Book Library context and anonymous types for the result:
var results =
Books
.Where(book => book.BookId == 1)
.Select(book =>
new
{
book.Name,
AuthorName = book.Author.Name, //Is a field in an associated table.
book.Publisher, //Is an associated table.
});
EDIT: without anon types
var results =
Books
.Where(book => book.BookId == 1)
.Select(book =>
new BookResult()
{
BookName = book.Name,
AuthorName = book.Author.Name, //Is a field in an associated table.
Publisher = book.Publisher, //Is an associated table.
});

Resources