Entity Framework: very slow performance when getting a navigation property

There are just two simple tables, A and B, where one A contains many Bs, as shown below. I created 20,000 B records and added them to one A. Now, if I restart the program and execute the if (a.Bs == null) line shown below, it takes 120 seconds!
I don't think 20,000 records is a lot for EF, so can anyone help me with this performance issue?
public class A
{
    [Key]
    public int EntityId { get; set; }
    public virtual ICollection<B> Bs { get; set; }
}
public class B
{
    [Key]
    public int EntityId { get; set; }
    public virtual A A { get; set; }
}
// The context:
public class MyDbContext : DbContext
{
    public MyDbContext()
        : base("MyDbContext")
    {
    }
    public DbSet<A> As { get; set; }
    public DbSet<B> Bs { get; set; }
}
// The performance issue:
using (var db = new MyDbContext())
{
    var a = db.As.First();
    if (a.Bs == null) // This line takes about 120 seconds!
    {
    }
}

In your code, accessing a.Bs makes the lazy loader materialize the whole collection, and materializing 20,000 records takes a long time. Also, I think that a.Bs is never null.
If you don't need the Bs loaded in memory, you can find another way to make the check, e.g.
if (a.Bs.Count() == 0) {}
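Note that reading a.Bs at all goes through the lazy-loading proxy, so a check written against the navigation property may still load every B. A minimal sketch of an alternative, assuming the check is phrased as a query over the B set instead. The helper name HasAnyBs is hypothetical, and the in-memory list below only stands in for db.Bs:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class A
{
    public int EntityId { get; set; }
    public virtual ICollection<B> Bs { get; set; }
}

public class B
{
    public int EntityId { get; set; }
    public virtual A A { get; set; }
}

public static class NavigationChecks
{
    // Hypothetical helper: asks the B set whether any row points at the given A.
    // Against a real DbSet<B>, a query of this shape is expected to run in the
    // database instead of loading the rows; verify the generated SQL.
    public static bool HasAnyBs(IQueryable<B> bs, int aId)
    {
        return bs.Any(b => b.A.EntityId == aId);
    }
}

public static class Program
{
    public static void Main()
    {
        var a1 = new A { EntityId = 1 };
        var a2 = new A { EntityId = 2 };

        // In-memory stand-in for db.Bs (assumption: no database is used here).
        var bs = new List<B>
        {
            new B { EntityId = 10, A = a1 },
            new B { EntityId = 11, A = a1 },
        }.AsQueryable();

        Console.WriteLine(NavigationChecks.HasAnyBs(bs, a1.EntityId)); // True
        Console.WriteLine(NavigationChecks.HasAnyBs(bs, a2.EntityId)); // False
    }
}
```

With an EF6 DbContext, db.Entry(a).Collection(x => x.Bs).Query().Count() is another way to count the related rows without loading them.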

Related

Why is adding objects to my model so slow?

I have a SimID query that joins a given list of cell phone numbers with their respective IDs in my DB; if a number doesn't exist in the DB, it returns 0 as the ID. If I watch my query while debugging, it executes perfectly and very quickly. However, when I loop through the results of the query, create a new object for each one and add it to my model, it takes 3 minutes to create 458 new objects.
I am new to EF and LINQ. What am I doing wrong? How can I optimize this code to execute faster?
Any help would be greatly appreciated.
// Query to create a virtual table of SimID and MSISDN joined on Cache.MSISDN
var SimID = (from ca_msisdn in Cache.MSISDN
             join db_simobj in ctx.Sims on ca_msisdn equals db_simobj.Msisdn into Holder
             from msisdnresult in Holder.DefaultIfEmpty()
             select new { MSISDN = ca_msisdn, ID = (msisdnresult == null || msisdnresult.SimId == 0 ? 0 : msisdnresult.SimId) });
// Loop through the virtual table and add new data to the model
foreach (var ToUpdate in SimID)
{
    if (ToUpdate.ID == 0)
    {
        Console.WriteLine("We have found a new MSISDN: " + ToUpdate.MSISDN + " adding it to the model.");
        ctx.Sims.Add(new Sim { Msisdn = ToUpdate.MSISDN });
    }
}
// My Sim object
public partial class Sim
{
    public Sim()
    {
        this.CDR_Event = new HashSet<CDR_Event>();
    }
    public long SimId { get; set; }
    public long SimStatusId { get; set; }
    public Nullable<long> FitmentCentreId { get; set; }
    public string Serial { get; set; }
    public string Msisdn { get; set; }
    public string Puk { get; set; }
    public string Network { get; set; }
    public Nullable<System.DateTime> SimStatusDate { get; set; }
    public Nullable<System.DateTime> ActivationDate { get; set; }
    public Nullable<System.DateTime> ExpiryDate { get; set; }
    public Nullable<long> APNSimStatusId { get; set; }
    public Nullable<System.DateTime> APNActivated { get; set; }
    public Nullable<System.DateTime> APNConfirmed { get; set; }
    public string Svr { get; set; }
    public virtual ICollection<CDR_Event> CDR_Event { get; set; }
}
Your main problem is that every time you add to the Sims collection on your context, you add to what the context's change tracker has to keep track of, which can become very expensive, memory-wise, once you have a large number of entities.
You can solve this in one of two ways:
1) Save the data in batches. That way the change tracker never has to keep track of a large number of added entities.
2) If you don't want to save it right away, keep it in an in-memory list first, and then do #1 when you do go to save those records.
The answer to this question, as IronMan84 alluded to, is to use .AddRange(): build a list inside the foreach and add everything at once outside the loop, instead of adding the objects one by one inside it.
List<Sim> _SIMS = new List<Sim>();
foreach (var ToUpdate in SimID)
{
    if (ToUpdate.ID == 0)
    {
        Console.WriteLine("We have found a new MSISDN: " + ToUpdate.MSISDN + " adding it to the model.");
        _SIMS.Add(new Sim { Msisdn = ToUpdate.MSISDN });
    }
}
ctx.Sims.AddRange(_SIMS);
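The two suggestions can be combined: collect the new entities first, then hand them to the context in fixed-size batches. A minimal sketch of the batching step, assuming a hypothetical helper InBatches and an arbitrary batch size of 100. The commented lines show where ctx.Sims.AddRange and ctx.SaveChanges would go; they are not executed here:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Batching
{
    // Hypothetical helper: splits a sequence into consecutive lists
    // of at most batchSize items each.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batch = new List<T>(batchSize);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }
        // Emit the final, partially filled batch if there is one.
        if (batch.Count > 0) yield return batch;
    }
}

public static class Program
{
    public static void Main()
    {
        // Stand-in for the 458 new MSISDNs from the question.
        var newMsisdns = Enumerable.Range(0, 458).Select(i => "MSISDN" + i).ToList();

        foreach (var batch in Batching.InBatches(newMsisdns, 100))
        {
            // With a real context (assumption, not run here):
            // ctx.Sims.AddRange(batch.Select(m => new Sim { Msisdn = m }));
            // ctx.SaveChanges();
            Console.WriteLine(batch.Count);
        }
    }
}
```

In EF6, setting ctx.Configuration.AutoDetectChangesEnabled = false while adding is another commonly used way to reduce the per-Add cost; whether it helps here would need measuring.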

Retrieving information from derived child object collections using LINQ

I have been trying to get a list of all Workflows that have Offices contained in a certain List, by Office Id. I can easily get all of the Workflows that have SingleWorkflowSteps, because those have only one Office, but I have been unable to work out how to get the ones contained in a MultiWorkflowStep. Every workflow step is either a SingleWorkflowStep or a MultiWorkflowStep that contains two or more SingleWorkflowSteps. At the time I designed this, it seemed like a logical way to do it, but alas, my LINQ-fu is not as good as I thought it was. Can someone please point me in the right direction? Code listed below:
var OfficesToFind = new List<int>(new int[] { 1, 3, 5, 7, 9, 10, 11, 12 });
public class Workflow
{
    public Workflow()
    {
        WorkflowSteps = new List<WorkflowStepBase>();
    }
    public int Id { get; set; }
    public virtual ICollection<WorkflowStepBase> WorkflowSteps { get; set; }
}
public abstract class WorkflowStepBase
{
    public int Id { get; set; }
    public int StatusId { get; set; }
    public virtual Workflow Workflow { get; set; }
    public virtual Status Status { get; set; }
}
public class MultiWorkflowStep : WorkflowStepBase
{
    public MultiWorkflowStep()
    {
        ChildSteps = new List<SingleWorkflowStep>();
    }
    public virtual ICollection<SingleWorkflowStep> ChildSteps { get; set; }
}
public class SingleWorkflowStep : WorkflowStepBase
{
    public int? ParentStepId { get; set; }
    public int OfficeId { get; set; }
    public virtual MultiWorkflowStep ParentStep { get; set; }
    public virtual Office Office { get; set; }
}
public class Office
{
    public int Id { get; set; }
    public string Name { get; set; }
}
public class WorkflowService : IWorkflowService<Workflow>
{
    private readonly IRepository<Workflow> _workflowService;
    private readonly IRepository<SingleWorkflowStep> _singleStepService;
    private readonly IRepository<MultiWorkflowStep> _multiStepService;
    public WorkflowService(IUnitOfWork uow)
    {
        _workflowService = uow.GetRepository<Workflow>();
        _singleStepService = uow.GetRepository<SingleWorkflowStep>();
        _multiStepService = uow.GetRepository<MultiWorkflowStep>();
    }
    // ~ ------- Other CRUD methods here -------- ~
    public IEnumerable<Workflow> GetWorkflowFilter(List<int> statuses, List<int> offices...)
    {
        var query = _workflowService.GetIQueryable(); // returns an IQueryable of dbset
        if (statuses.Any())
        {
            query = query.Where(q => statuses.Contains(q.StatusId));
        }
        if (offices.Any())
        {
            // Get all active single steps and the ones that contain the offices
            var singleSteps = _singleStepService
                .Where(s => s.StatusId == (int)Enumerations.StepStatus.ACTIVE)
                .Where(s => offices.Contains(s.OfficeId));
            // Get all of the parent Workflows for the singleSteps
            var workflows = singleSteps.Select(w => w.Workflow);
            // Update the query with the limited scope
            query = query.Where(q => q.Workflow.Contains(q));
        }
        return query.ToList();
    }
}
OK, after a good night's sleep, being all bright-eyed and bushy-tailed, I figured out my own problem. First, the updated code was all wrong. Each derived WorkflowStep has access to its Workflow, and each MultiWorkflowStep contains a list of SingleWorkflowSteps, so the list of all SingleWorkflowSteps already includes the ones inside MultiWorkflowSteps. I simply needed to get the parent Workflows of the matching SingleWorkflowSteps. Next, I updated my query to keep only the Workflows contained in that new Workflow list. Here is the corrected code for the GetWorkflowFilter method:
...
    if (offices.Any())
    {
        // Get all active single steps and the ones that contain the offices
        var singleSteps = _singleStepService
            .Where(s => s.StatusId == (int)Enumerations.StepStatus.ACTIVE)
            .Where(s => offices.Contains(s.OfficeId));
        // Get all of the parent Workflows for the singleSteps
        var workflows = singleSteps.Select(w => w.Workflow);
        // Limit the query to the Workflows found above
        query = query.Where(q => workflows.Contains(q));
    }
    return query.ToList();
}

Counting performance issue

The class Group is like the following:
class Group {
    public int GroupId { get; set; } // referenced by AsJson below
    public string Name { get; set; }
    public virtual ICollection<Person> Members { get; set; }
    public dynamic AsJson() {
        return new
        {
            groupId = this.GroupId,
            name = this.Name,
            membersCount = this.Members.Count // issue!
        };
    }
}
I have a view where all groups are listed, as follows:
Group name    Members
Bleh          15
Zort          40
Narf          12
This list is retrieved with an AJAX call that returns JSON. The action has code like the following:
groups = from g in (from grp in db.Groups where ... orderby grp.Name select grp).ToList()
         select g.AsJson();
The problem is that building this list was taking far too long (around 20 seconds).
I replaced "this.Members.Count" with a maintained property, that is:
class Group {
    public int GroupId { get; set; } // referenced by AsJson below
    public string Name { get; set; }
    public virtual ICollection<Person> Members { get; set; }
    public int MembersCount { get; set; } // added
    public dynamic AsJson() {
        return new
        {
            groupId = this.GroupId,
            name = this.Name,
            membersCount = this.MembersCount // changed
        };
    }
}
And it started to work fine (1-2 seconds to generate the list).
Is there a better way to achieve this? I'm starting to have problems maintaining the MembersCount property, because Members are added and removed in several parts of the code.
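One alternative to a maintained counter, sketched here under the assumption that the list is built directly from a query: project the count inside the query, so that EF can compute it in SQL instead of lazy-loading every Members collection. The GroupSummary type and the Summarize helper are hypothetical, and the in-memory list only stands in for db.Groups:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int PersonId { get; set; }
}

public class Group
{
    public int GroupId { get; set; }
    public string Name { get; set; }
    public virtual ICollection<Person> Members { get; set; } = new List<Person>();
}

// Hypothetical DTO holding only what the view needs.
public class GroupSummary
{
    public int GroupId { get; set; }
    public string Name { get; set; }
    public int MembersCount { get; set; }
}

public static class GroupQueries
{
    // The count is part of the projection, so it is evaluated by the query
    // provider. With EF this is expected to become a COUNT subquery; check
    // the generated SQL to confirm.
    public static List<GroupSummary> Summarize(IQueryable<Group> groups)
    {
        return groups
            .OrderBy(g => g.Name)
            .Select(g => new GroupSummary
            {
                GroupId = g.GroupId,
                Name = g.Name,
                MembersCount = g.Members.Count()
            })
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // In-memory stand-in for db.Groups (assumption: no database is used here).
        var groups = new List<Group>
        {
            new Group { GroupId = 1, Name = "Zort", Members = new List<Person> { new Person(), new Person() } },
            new Group { GroupId = 2, Name = "Bleh", Members = new List<Person> { new Person() } },
        }.AsQueryable();

        foreach (var s in GroupQueries.Summarize(groups))
        {
            Console.WriteLine(s.Name + " " + s.MembersCount);
        }
    }
}
```

The trade-off is that the JSON shape is then built from the projection rather than from Group.AsJson(), so the entity no longer needs a stored MembersCount.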

EF Code First and LINQ extension methods - foreign keys, am I doing it wrong?

I have something like this: (pseudocode)
public class Author
{
    int id;
    public List<Thread> Threads;
    public List<ThreadPoints> ThreadPointses;
}
public class Thread
{
    int id;
    public List<ThreadPoints> ThreadPointses;
}
public class ThreadPoints
{
    int id;
    int Points;
}
I am not sure whether the above is correct, but now I want to obtain the number of points that a specified Author has in a specified Thread.
I cannot directly access ThreadPoints.Thread_id, because it is not exposed on the class, even though the column physically exists in the database.
So do I need to change my model, or am I unaware of some useful methods?
So basically, my model looks like this:
public class Account
{
    [Key]
    public Guid AccountId { get; set; }
    public string UserName { get; set; }
    public List<Post> Posts { get; set; }
    public List<Post> ModifiedPosts { get; set; }
    public List<Thread> Threads { get; set; }
    public List<ThreadPoints> ThreadPointses { get; set; }
    public List<Thread> LastReplied { get; set; }
    public int Points { get; set; }
}
public class Thread
{
    [Key]
    public int Id { get; set; }
    public List<ThreadPoints> ThreadPointses { get; set; }
    public List<Post> Posts { get; set; }
    public int CurrentValue { get; set; }
    public int NumberOfPosts { get; set; }
    public int TotalValue { get; set; }
    public int Views { get; set; }
    public string Title { get; set; }
    public DateTime DateCreated { get; set; }
}
public class ThreadPoints
{
    [Key]
    public int Id { get; set; }
    public int Points { get; set; }
}
What I need is this: when a user creates a thread, he puts some number of points into it. In the next action, I want to take that number of points (from the database) and increase it. But I only have the thread id as input.
Your answer might be good (as far as I can tell from trying to implement it), but I am still not sure about this model. Maybe I should manually add foreign keys to my model? It would surely be simpler, but then I would have two foreign keys in my database...
Since you're not explicitly mapping your FKs, Entity Framework is generating them and hiding them away, so to get to the ids you'll need to follow the navigation collections.
I'm not sure about your question, but do you want the number of Points inside a specific ThreadPoints for a given author? Your model doesn't seem to support this very well, but you could do something like this:
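public int GetPoints(Author author, Thread thread)
{
    // Find the author's thread with the same id as the one passed in
    var match = author.Threads.FirstOrDefault(t => t.id == thread.id);
    // If the author has no such thread, there are no points to sum
    return match == null ? 0 : match.ThreadPointses.Sum(tp => tp.Points);
}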
This would return the sum of all the points contained in the list of threadpoints, which are contained in the list of threads with the same id as the thread you passed in, for the specified author.
If this doesn't work for you - can you post your actual model?

How to Count items in nested collection / codefirst EntityFramework

I've got a Code First collection defined as shown below.
For any given EmailOwnerId, I want to count the number of EmailDetailAttachments records that exist, without actually downloading all the images themselves.
I know I can do something like
var emailsToView = (from data in db.EmailDetails.Include("EmailDetailAttachments")
                    where data.EmailAccount.EmailOwnerId == 999
                    select data).ToList();
int cnt = 0;
foreach (var email in emailsToView)
{
    cnt += email.EmailDetailAttachments.Count();
}
but that means I've already downloaded all the bytes of images from my far away server.
Any suggestion would be appreciated.
public class EmailDetail
{
    [Key]
    [DatabaseGeneratedAttribute(DatabaseGeneratedOption.Identity)]
    public int Id { get; set; }
    public int EmailOwnerId { get; set; }
    public virtual ICollection<ImageDetail> EmailDetailAttachments { get; set; }
    ..
}
public class ImageDetail
{
    [Key]
    [DatabaseGeneratedAttribute(DatabaseGeneratedOption.Identity)]
    public int Id { get; set; }
    [MaxLengthAttribute(256)]
    public string FileName { get; set; }
    [MaxLengthAttribute(256)]
    public string ContentMimeType { get; set; }
    public byte[] ImageDataBytes { get; set; }
    public DateTime ImageCreation { get; set; }
}
The engine should be able to translate this into a COUNT(*) statement.
var emailsToView = (from data in db.EmailDetails // no Include
                    where data.EmailAccount.EmailOwnerId == 999
                    select new {
                        Detail = data,
                        Count = data.EmailDetailAttachments.Count() }
                    ).ToList();
But you'll have to verify if this produces the right (and more efficient) SQL.
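If only the total is needed, the per-email rows can be skipped as well. A minimal sketch, assuming EmailOwnerId sits directly on EmailDetail as in the class shown in the question (the question's query reaches it through data.EmailAccount instead). The CountAttachments helper is hypothetical, and the in-memory list only stands in for db.EmailDetails:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ImageDetail
{
    public int Id { get; set; }
    public byte[] ImageDataBytes { get; set; }
}

public class EmailDetail
{
    public int Id { get; set; }
    public int EmailOwnerId { get; set; }
    public virtual ICollection<ImageDetail> EmailDetailAttachments { get; set; } = new List<ImageDetail>();
}

public static class AttachmentQueries
{
    // Hypothetical helper: total number of attachments across all emails of
    // one owner. Flattening with SelectMany and then counting keeps the whole
    // computation inside the query, so no image bytes need to be materialized.
    public static int CountAttachments(IQueryable<EmailDetail> emails, int ownerId)
    {
        return emails
            .Where(e => e.EmailOwnerId == ownerId)
            .SelectMany(e => e.EmailDetailAttachments)
            .Count();
    }
}

public static class Program
{
    public static void Main()
    {
        // In-memory stand-in for db.EmailDetails (assumption: no database is used here).
        var emails = new List<EmailDetail>
        {
            new EmailDetail
            {
                Id = 1,
                EmailOwnerId = 999,
                EmailDetailAttachments = new List<ImageDetail> { new ImageDetail { Id = 1 }, new ImageDetail { Id = 2 } }
            },
            new EmailDetail
            {
                Id = 2,
                EmailOwnerId = 999,
                EmailDetailAttachments = new List<ImageDetail> { new ImageDetail { Id = 3 } }
            },
            new EmailDetail
            {
                Id = 3,
                EmailOwnerId = 5,
                EmailDetailAttachments = new List<ImageDetail> { new ImageDetail { Id = 4 } }
            },
        }.AsQueryable();

        Console.WriteLine(AttachmentQueries.CountAttachments(emails, 999)); // 3
    }
}
```

As with the query above, the SQL that EF generates for this shape should be checked before relying on it.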

Resources