Entity Framework with LINQ using CONTAINS in WHERE clause very slow with large integer list - linq

I have a function that first queries a function in the database that returns a list of employees that a user is allowed to see. This run in a couple ms. Using this list I query the employee table to get all the employees that this user is allowed to see. When I run the generated sql in query analyzer it takes only a few milliseconds. When it runs from entity framework it is taking over 8 seconds.
Initially I had a list of allowedEmployees but then read http://mikeinba.blogspot.ca/2009/09/speeding-up-linq-contains-queries-with.html and tried using a HashSet instead but the performace is still really bad. How can I get its performance similar to what it is in sql query analyzer? What am I doing wrong?
I am using SQL 2008 and EF5.0
public IQueryable<Employee> FindAllByWithPermissions(int eID, Expression<Func<Employee, bool>> predicate)
{
if (predicate != null)
{
HashSet<int> allowedEmployees = SecurityRepository.GetPermissableEmployees(eID);
return DataContext.Set<Employee>().Where(predicate).Where(p => allowedEmployees.Contains(p.EmployeeID)).AsQueryable<Employee>(); ;
}
throw new ArgumentNullException("Predicate value must be passed to FindBy<T,TKey>.");
}
It seems that when it writes the sql to have the IN clause it takes a long time. There are a few thousand permissable employees but why does generating the sql for it take so long?

Return an IQueryable instead of a HashSet, and then join it with your Employee IQueryable.
var availableEmployees = SecurityRepository.GetPermissableEmployees(eID);
var allEmployees= DataContext.Set<Employee>();
query = from item in allEmployees.Where(predicate)
join t in availableEmployees on item.EmployeeID equals t.EmployeeID
select item;

Related

Linq mapping EntityDb to Dto is very slow, one record

Linq mapping EntityDb to Dto is very slow.
I get the only ONE record from table with join other tables by Id.
Example:
DataAccess da;
var res = from t1 in da.Table1
join t2 in da.Table2 on t1.rf_table2Id equals t2.Table2Id
join etc...
over 20 joins...
where t1.Table1Id == 20 /*example*/
select new MyDto
{
Id = t1.Table1Id,
Name = t1.Name,
Type = new ReferenceDto()
{
Id = t2.Table2Id,
Name = t2.Name
},
and etc...
over 50 fields
};
The problem in mapping. If I get the record without mapping all fields, only Id,
res.FirstOrDefault() executes in 100-500 milliseconds, fast.
But, if I map all fields the res.FirstOrDefault() takes 3 seconds to execute, which is too slow.
My DTO is structure view.
In SQL Server Profiler, the query runs very fast.
What can I do ?
I need to get more information at the one time, by Id the record.
Solution is:
1) I get the original sql-query from SQL Server Profiler.
2) Create plain DTO class (over 200 fields) for execute query by context.ExecuteStoreQuery.
3) Then, I've done mapping the result plain DTO to my structure DTO Model.
Conclusion is:
About 3 seconds to execute linq statement with mapping to big structure DTO with about 200 fields.
And ONLY 500-600 milliseconds with new solution, described above!!!
Improved by 5-6 times!!!
With large DTO structure linq statement is bad, when we want to get the large record data (with many tables) by Id.
P.S.
Structure DTO Model is class with sub-classes, example:
select new MyDto
{
Id = t1.Table1Id,
Name = t1.Name,
Type = new ReferenceDto()
{
Id = t2.Table2Id,
Name = t2.Name
},
and etc...
over 50 fields
};

If a linq query returns empty does it return null?

I have the following linq query:
vm.logs = (from l in db.ActivityLogs
orderby l.Time
select l).Take(2);
If the db table is empty will this return null?
If not how can I detect if a query did return any information?
It will return an IEnumerable<ActivityLog> with no elements.
To check if there are any elements, use the Any() method:
if(!logs.Any())
{
Console.WriteLine("No elements found.");
}
Also note that as you've written it, vm.logs will be lazily evaluated, that is, it won't be fetched from the database until it is used. If you first do a .Any() and then later access the contents of the query, there will be two queries executed in the database. To avoid that, materialize (force the query to execute) by adding a ToList() to the query:
vm.logs = (from l in db.ActivityLogs
orderby l.Time
select l).Take(2).ToList();

How do I get a Distinct list to work with EF 4.x DBSet Context and the IEqualityComparer?

I have been trying for hours to get a Distinct to work for my code.
I am using EF 4.3, MVC3, Razor and trying to get a list downto product id and name.
When I run the Sql query against the DB, it's fine.
Sql Query is
SELECT DISTINCT [ProductId]
,[Product_Name]
FROM [dbo].[PRODUCT]
The only other column in that table is a country code so that's why a standard distinct() isn't working.
I have gone as far as creating an IEqualityComparer
Here is code:
public class DistinctProduct : IEqualityComparer<PRODUCT>
{
public bool Equals(PRODUCT x, PRODUCT y)
{
return x.ProductId.Equals(y.ProductId);
}
public int GetHashCode(PRODUCT obj)
{
return obj.ProductId.GetHashCode();
}
}
here is where I called it.
IEqualityComparer<PRODUCT> customComparer = new DistinctProduct();
IEnumerable<PRODUCT> y = db.PRODUCTs.Distinct(customComparer);
But when it hit's that Last line I get an error out of it stating...
LINQ to Entities does not recognize the method 'System.Linq.IQueryable`1[MyService.Models.PRODUCT] Distinct[PRODUCT](System.Linq.IQueryable`1[MyService.Models.PRODUCT], System.Collections.Generic.IEqualityComparer`1[MyService.Models.PRODUCT])' method, and this method cannot be translated into a store expression.
Can anyone tell me what I'm doing wrong?
Thanks,
David
Is there any reason you could just not use a distinct like the following?
var distinctProdcts = (from p in db.PRODUCTs
select new {
ProductId = p.ProductId,
Product_Name = p.ProductName
}).Distinct();
This would remove the country code from the query before you do the distinct.
Entity Framework is trying to translate your query to a SQL query. Obviously it does not know how to translate the IEqualityComparerer. I think the question is whether you want to do the Distinct in the datbase (in which case your client gets only filtered results) or you are OK with bringing all the data to the client and select distinct on the client. If you want the filtering to happen on the database side (which will make your app perform much better) and you want to be able to use different strategies for comparing you can come up with a code that builds distinct criteria on top of your query. If you are fine with bringing your data to the client (note that it can be a lot of data) you should be able just to do (.ToList() will trigger querying the database and materializing results):
IEnumerable<PRODUCT> y = db.PRODUCTs.ToList().Distinct(customComparer);

Create a linq subquery returns error "Local sequence cannot be used in LINQ to SQL implementations of query operators except the Contains operator"

I have created a linq query that returns my required data, I now have a new requirement and need to add an extra field into the returned results. My entity contains an ID field that I am trying to map against another table without to much luck.
This is what I have so far.
Dictionary<int, string> itemDescriptions = new Dictionary<int, string>();
foreach (var item in ItemDetails)
{
itemDescriptions.Add(item.ItemID, item.ItemDescription);
}
DB.TestDatabase db = new DB.TestDatabase(Common.GetOSConnectionString());
List<Transaction> transactionDetails = (from t db.Transactions
where t.CardID == CardID.ToString()
select new Transaction
{
ItemTypeID= t.ItemTypeID,
TransactionAmount = t.TransactionAmount,
ItemDescription = itemDescriptions.Select(r=>r.Key==itemTypeID).ToString()
}).ToList();
What I am trying to do is key the value from the dictonary where the key = itemTypeID
I am getting this error.
Local sequence cannot be used in LINQ to SQL implementations of query operators except the Contains operator.
What do I need to modify?
This is a duplicate of this question. The problem you're having is because you're trying to match an in-memory collection (itemDescriptions) with a DB table. Because of the way LINQ2SQL works it's trying to do this in the DB which is not possible.
There are essentially three options (unless I'm missing something)
1) refactor your query so you pass a simple primitive object to the query that can be passed accross to the DB (only good if itemDescriptions is a small set)
2) In your query use:
from t db.Transactions.ToList()
...
3) Get back the objects you need as you're doing, then populate ItemDescription in a second step.
Bear in mind that the second option will force LINQ to evaluate the query and return all transactions to your code that will then be operated on in memory. If the transaction table is large this will not be quick!

Linq to NHibernate generating 3,000+ SQL statements in one request!

I've been developing a webapp using Linq to NHibernate for the past few months, but haven't profiled the SQL it generates until now. Using NH Profiler, it now seems that the following chunk of code hits the DB more than 3,000 times when the Linq expression is executed.
var activeCaseList = from c in UserRepository.GetCasesByProjectManagerID(consultantId)
where c.CompletionDate == null
select new { c.PropertyID, c.Reference, c.Property.Address, DaysOld = DateTime.Now.Subtract(c.CreationDate).Days, JobValue = String.Format("£{0:0,0}", c.JobValue), c.CurrentStatus };
Where the Repository method looks like:
public IEnumerable<Case> GetCasesByProjectManagerID(int projectManagerId)
{
return from c in Session.Linq<Case>()
where c.ProjectManagerID == projectManagerId
select c;
}
It appears to run the initial Repository query first, then iterates through all of the results checking to see if the CompletionDate is null, but issuing a query to get c.Property.Address first.
So if the initial query returns 2,000 records, even if only five of them have no CompletionDate, it still fires off an SQL query to bring back the address details for the 2,000 records.
The way I had imagined this would work, is that it would evaluate all of the WHERE and SELECT clauses and simply amalgamate them, so the inital query would be like:
SELECT ... WHERE ProjectManager = #p1 AND CompleteDate IS NOT NULL
Which would yield 5 records, and then it could fire the further 5 queries to obtain the addresses. Am I expecting too much here, or am I simply doing something wrong?
Anthony
Change the declaration of GetCasesByProjectManagerID:
public IQueryable<Case> GetCasesByProjectManagerID(int projectManagerId)
You can't compose queries with IEnumerable<T> - they're just sequences. IQueryable<T> is specifically designed for composition like this.
Since I can't add a comment yet. Jon Skeet is right you'll want to use IQueryable, this is allows the Linq provider to Lazily construct the SQL. IEnumerable is the eager version.

Resources