How can Linq be so incredibly fast? C# - linq

Let's say I have 100 000 objects of type Person which have a date property with their birthday in them.
I place all the objects in a List<Person> (or an array) and also in a dictionary where the date is the key and each value is an array/list of the persons who share that birthday.
Then I do this:
DateTime date = new DateTime(); // Just some date
var personsFromList = personList.Where(person => person.Birthday == date);
var personsFromDictionary = dictionary[date];
If I run that 1000 times, the LINQ .Where lookup ends up significantly faster than the dictionary lookup. Why is that? It does not seem logical to me. Are the results being cached (and reused) behind the scenes?

From Introduction to LINQ Queries (C#) (The Query)
... the important point is that in LINQ, the query variable itself takes no action and returns no data. It just stores the information that is required to produce the results when the query is executed at some later point.
This is known as deferred execution. (later down the same page):
As stated previously, the query variable itself only stores the query commands. The actual execution of the query is deferred until you iterate over the query variable in a foreach statement. This concept is referred to as deferred execution...
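Some LINQ methods must iterate the IEnumerable and therefore execute immediately: methods like Count, Max and Average, i.e. all the aggregation methods.
Another way to force immediate execution is to use ToArray or ToList, which will execute the query and store its results in an array or list.
In the code shown in the question, personsFromList is never enumerated, so the Where call only builds the query and the list is never actually searched.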
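To make deferred execution visible, here is a minimal sketch that counts how often the predicate actually runs. The class name DeferredExecutionDemo and the IsEven helper are made up for illustration and are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many times the predicate has actually been invoked.
    public static int PredicateCalls;

    public static bool IsEven(int n)
    {
        PredicateCalls++;
        return n % 2 == 0;
    }

    public static void Main()
    {
        List<int> numbers = Enumerable.Range(1, 10).ToList();

        // Building the query does not invoke the predicate.
        IEnumerable<int> query = numbers.Where(IsEven);
        Console.WriteLine($"After Where:  {PredicateCalls} predicate calls");

        // ToList enumerates the source, so the predicate runs once per element.
        List<int> evens = query.ToList();
        Console.WriteLine($"After ToList: {PredicateCalls} predicate calls, {evens.Count} results");
    }
}
```

A benchmark that only calls Where measures the cost of creating the query object, not the cost of the lookup. Appending ToList() (or iterating with foreach) inside the timed loop would give a fairer comparison with the dictionary.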

Related

DayOfWeek in LINQ query

I have a simple list of integers that represent days of the week. I am trying to check whether the Date property of my entity falls on one of the selected days. But if I try to pass it in a query that targets the database like this:
query.Where(e => selectedDaysOfWeek.Contains((int)e.Date.DayOfWeek));
I got the exception:
The LINQ expression ... could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to either AsEnumerable(), AsAsyncEnumerable(), ToList(), or ToListAsync().
If, on the other hand, I first execute the query by calling ToList() (for example) and then apply the same where condition to the resulting list, it works:
var items = query.ToList();
items = items.Where(e => selectedDaysOfWeek.Contains((int)e.Date.DayOfWeek)).ToList();
Although in my case this is acceptable, I would like to fetch fewer items from the database. Is there a way to check DayOfWeek when querying the database, as was my initial intent?
You are accessing the DayOfWeek property from a DateTime reference in the query.
Entity Framework "doesn't know" how to translate that to SQL, which is why you are getting an exception in the first piece of code.
The second piece works because all of the data has already been fetched from the database at the .ToList() call, and the .Where filtering happens in memory.
If you wish to implement that logic on the database side, you will have to write your own SQL statement.
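If writing SQL by hand is not an option, one possible middle ground is to narrow the result set with a condition the provider can usually translate, then switch to client evaluation for the DayOfWeek check. The sketch below assumes a date range is available as such a condition; the Entry class, the Filter helper and the range parameters are hypothetical, and the in-memory AsQueryable() source only stands in for a real database query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Entry
{
    public DateTime Date { get; set; }
}

public static class DayOfWeekFilter
{
    // Applies the (assumed translatable) date-range filter first,
    // then evaluates the DayOfWeek condition on the client.
    public static List<Entry> Filter(
        IQueryable<Entry> query,
        DateTime from,
        DateTime to,
        ICollection<int> selectedDaysOfWeek)
    {
        return query
            .Where(e => e.Date >= from && e.Date < to)
            .AsEnumerable() // everything after this runs in memory
            .Where(e => selectedDaysOfWeek.Contains((int)e.Date.DayOfWeek))
            .ToList();
    }

    public static void Main()
    {
        // Two weeks of sample data starting on Monday, 2024-01-01.
        List<Entry> entries = Enumerable.Range(0, 14)
            .Select(i => new Entry { Date = new DateTime(2024, 1, 1).AddDays(i) })
            .ToList();

        var weekend = new List<int> { (int)DayOfWeek.Saturday, (int)DayOfWeek.Sunday };

        List<Entry> result = Filter(
            entries.AsQueryable(),
            new DateTime(2024, 1, 1),
            new DateTime(2024, 1, 8),
            weekend);

        foreach (Entry e in result)
        {
            Console.WriteLine($"{e.Date:yyyy-MM-dd} {e.Date.DayOfWeek}");
        }
    }
}
```

This still transfers every row in the date range, so it only helps when the translatable condition is selective.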

EF core 2 first query slow

I'm using EF core 2 as ORM in my project.
I faced this problem while executing this query:
var query = (from droitsGeo in _entities.DroitsGeos
             join building in _entities.Batiments
                 on droitsGeo.IdPerimetre equals building.IdBatiment
             where droitsGeo.IdUtilisateur == idUser &&
                   droitsGeo.IdClient == idClient &&
                   building.Valide == true &&
                   droitsGeo.IdNiveauPerimetre == geographicalLevel
             orderby sort ascending
             select new GeographicalModel
             {
                 Id = building.IdBatiment,
                 IdParent = building.IdEtablissement,
                 Label = building.LibBatiment,
             });
The first execution took about 5 seconds and the subsequent ones less than one second, as shown below:
First execution of query :
Time elapsed EF: 00:00:04.8562419
After first execution of query :
Time elapsed EF: 00:00:00.5496862
Time elapsed EF: 00:00:00.6658079
Time elapsed EF: 00:00:00.6176030
I get the same result using a stored procedure.
When I execute the SQL query generated by EF directly in SQL Server, the result is returned in less than a second.
What is wrong with EF Core 2, or did I miss something in the configuration?
By default, EF tracks all the entities you run queries against.
When you run the query for the first time, the change-tracking mechanism kicks in, which is why it takes a little longer.
You can avoid this, especially when retrieving collections, by using .AsNoTracking() when composing the query.
Take a look:
var items = DbContext.MyDbSet
.Include(x => x.SecondObject)
.AsNoTracking()
.ToList();
EF Core needs to compile LINQ queries using reflection, therefore the first query is always slow. There is already a GitHub issue about this.
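To separate the one-time warm-up cost from the steady-state cost, it helps to time the same query several times in a row. The following is a small sketch of such a harness; QueryTimer and Measure are hypothetical names, and the Thread.Sleep call is only a stand-in for the real work (for example, calling ToList() on the query against the DbContext).

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

public static class QueryTimer
{
    // Runs the given action `runs` times and returns the elapsed time of each run.
    public static List<TimeSpan> Measure(Action runQuery, int runs)
    {
        var timings = new List<TimeSpan>();
        for (int i = 0; i < runs; i++)
        {
            Stopwatch stopwatch = Stopwatch.StartNew();
            runQuery();
            stopwatch.Stop();
            timings.Add(stopwatch.Elapsed);
        }
        return timings;
    }

    public static void Main()
    {
        // Placeholder for the real query execution.
        Action runQuery = () => Thread.Sleep(10);

        List<TimeSpan> timings = Measure(runQuery, 4);
        for (int i = 0; i < timings.Count; i++)
        {
            Console.WriteLine($"Run {i + 1}: {timings[i]}");
        }
    }
}
```

If only the first run is slow, the cost is a one-time start-up effect rather than a problem with the generated SQL, which matches the timings in the question.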
I have a simple idea to resolve this issue with the help of stored procedures and thereafter AutoMapper.
Create a stored procedure that returns all the columns you want, no matter if they come from different tables. Once the data is received from the stored procedure and mapped onto one of your model classes, you can use AutoMapper to map only the relevant attributes to other classes. Please note that this is not a tutorial on how to use stored procedures; the following example might explain the idea better:
A stored procedure is created which returns results from three tables named A, B and C.
A model class named SP_Result.cs is created for the stored procedure, to receive the object it returns (this is required when working with stored procedures in EF Core).
ViewModels are created with the same attributes as those returned from each of the tables A, B and C.
Thereafter, mapping configurations are created between SP_Result and the ViewModels of Class A, Class B and Class C, e.g. CreateMap<SP_Result, ViewModel_A>(); CreateMap<SP_Result, ViewModel_B>();. I suppose you would have request and response objects which can be used instead of ViewModels. Name the properties accordingly in the stored procedure using the AS keyword, e.g. Select std_Name AS 'Name'.
This mapping will map the individual properties to each class. AutoMapper ignores properties that do not exist in either of the classes mentioned in the mapping configuration.
If you are selecting a list of objects where each object has its own list of objects, this scenario will generally create N + 1 queries in EF. In fact, if you try to achieve this using stored procedures, you will have to create multiple queries or run the stored procedure multiple times (in a loop, maybe), or you will end up receiving a Cartesian product.

Lightswitch 2013 Linq queries to Get min value

I'm writing a timesheet application (Silverlight) and I'm completely stuck on getting LINQ queries working. I'm new to LINQ and I just read, and did many examples from, a LINQ book, including LINQ to Objects, LINQ to SQL and LINQ to Entities (I assume, but am not 100% sure, that the latter is what LightSwitch uses). I plan to study a LOT more LINQ, but just need to get this one query working.
So I have an entity called Items which lists every item in a job and its serial no
So: Job.ID int, ID int, SerialNo long
I also have a Timesheets entity that contains shift dates, job no and start and end serial no produced
So Job.ID int, ShiftDate date, Shift int, StartNo long, EndNo long
When the user selects a job from an autocomplete box, I want to look up the MAX(SerialNo) for that job in the Timesheets entity. If that is null (i.e. none have been produced), I want to look up the MIN(SerialNo) from the Items entity for that job (i.e. what's the first serial no they should produce)
I realize I need a first or default and need to specify the MIN(SerialNo) from Items as a default.
My Timesheet screen uses TimesheetProperty as its datasource
I tried the following just to get the MAX(SerialNo) from Timesheets entity:
var maxSerialNo =
(from ts in this.DataWorkspace.SQLData.Timesheets
where ts.Job.ID == this.TimesheetProperty.Job.ID
select ts.StartNo).Min();
but I get the following errors:
Instance argument: cannot convert from 'Microsoft.LightSwitch.IDataServiceQueryable' to 'System.Collections.Generic.IEnumerable'
'Microsoft.LightSwitch.IDataServiceQueryable' does not contain a definition for 'Min' and the best extension method overload 'System.Linq.Enumerable.Min(System.Collections.Generic.IEnumerable)' has some invalid arguments
I also don't get why I can't use this:
var maxSerialNo = this.DataWorkspace.SQLData.Timesheets.Min(ts => ts.StartNo);
Can anyone point me in the right direction?
Thanks
Mark
IDataServiceQueryable doesn't support the full set of LINQ operators that IEnumerable has.
IDataServiceQueryable – This is a LightSwitch-specific type that allows a restricted set of “LINQ-like” operators that are remote-able to the middle-tier and ultimately issued to the database server. This interface is the core of the LightSwitch query programming model. IDataServiceQueryable has a member to execute the query, which returns results that are IEnumerable. [Reference]
One possible solution is to execute your query first, calling .ToList() to get a collection of type IEnumerable, and then call .Min() on that result. But that isn't a good idea if you have a large amount of data, because .ToList() retrieves all data matching the query and does the further processing on the client side, which is inefficient.
Another way is to change your query so that it uses only operators supported by IDataServiceQueryable, which avoids retrieving unnecessary data to the client. For example, to get the minimum StartNo you can order by StartNo ascending and take the first row instead of using the .Min() operator (ordering descending would give the maximum instead):
var minStartNo =
(
    from ts in this.DataWorkspace.SQLData.Timesheets
    where ts.Job.ID == this.TimesheetProperty.Job.ID
    orderby ts.StartNo ascending select ts
).FirstOrDefault();
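The ordering approach can be tried out in memory: ordering ascending and taking the first element yields the minimum, ordering descending yields the maximum. The sketch below uses plain LINQ to Objects rather than LightSwitch; the Timesheet class, the SerialLookup helper and the fallback value are hypothetical stand-ins for the entities described in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Timesheet
{
    public int JobId { get; set; }
    public long StartNo { get; set; }
    public long EndNo { get; set; }
}

public static class SerialLookup
{
    // Smallest StartNo for a job, or null when the job has no timesheets.
    public static long? MinStartNo(IEnumerable<Timesheet> timesheets, int jobId)
    {
        var first = timesheets
            .Where(ts => ts.JobId == jobId)
            .OrderBy(ts => ts.StartNo)
            .FirstOrDefault();
        return first?.StartNo;
    }

    // Largest EndNo for a job, or null when the job has no timesheets.
    public static long? MaxEndNo(IEnumerable<Timesheet> timesheets, int jobId)
    {
        var last = timesheets
            .Where(ts => ts.JobId == jobId)
            .OrderByDescending(ts => ts.EndNo)
            .FirstOrDefault();
        return last?.EndNo;
    }

    public static void Main()
    {
        var timesheets = new List<Timesheet>
        {
            new Timesheet { JobId = 1, StartNo = 100, EndNo = 199 },
            new Timesheet { JobId = 1, StartNo = 200, EndNo = 299 },
        };

        // Hypothetical fallback: the first serial number from the Items entity.
        long firstItemSerial = 1;

        long lastProducedJob1 = MaxEndNo(timesheets, 1) ?? firstItemSerial;
        long lastProducedJob2 = MaxEndNo(timesheets, 2) ?? firstItemSerial;

        Console.WriteLine($"Job 1: {lastProducedJob1}");
        Console.WriteLine($"Job 2: {lastProducedJob2}");
    }
}
```

Returning a nullable value makes the "nothing produced yet" case explicit, so the fallback to the Items minimum can be expressed with the ?? operator.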

Concatenating a LINQ query and LINQ sort statement

I'm having a problem joining two LINQ queries.
Currently, my (original) code looks like this
s.AnimalTypes.Sort((x, y) => string.Compare(x.Type, y.Type));
What I'm needing to do is change this based on a date, then select all data past that date, so I have
s.AnimalTypes.Select(t=>t.DateChanged > dateIn).ToList()
s.AnimalTypes.Sort((…
This doesn't look right as it's not sorting the data selected, rather sorting everything in s.AnimalTypes.
Is there a way to concatenate the two LINQ lines? I've tried
s.AnimalTypes.Select(t=>t.DateChanged > dateIn).ToList().Sort((…
but this gives me an error on the Sort section.
Is there a simple way to do this? I've looked around and Group and OrderBy keep cropping up, but I'm not sure these are what I need here.
Thanks
From your description, I believe you want something like:
var results = s.AnimalTypes.Where(t => t.DateChanged > dateIn).OrderBy(t => t.Type);
You can call ToList() to convert to a List<T> at the end if required.
There are a couple of fundamental concepts I believe you are missing here -
First, unlike List<T>.Sort, the LINQ extension methods don't change the original collections, but rather return a new IEnumerable<T> with the filtered or sorted results. This means you always need to assign something to the return value (hence my var results = above).
Second, Select performs a mapping operation - transforming the data from one form to another. For example, you could use it to extract out the DateChanged (Select(t => t.DateChanged)), but this would give you an enumeration of dates, not the original animal types. In order to filter or restrict the list with a predicate (criteria), you'd use Where instead.
Finally, you can use OrderBy to reorder the resulting enumerable.
You are using Select when you actually want to use Where.
Select is a projection from a collection of one type into another type. You won't increase or reduce the number of items in a collection using Select, but you can instead select each object's name or some other property.
Where is what you would use to filter a collection based on a boolean predicate.
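The difference between the two operators is easy to see side by side. This is a small sketch using an assumed AnimalType class with the Type and DateChanged properties from the question; the sample data and the ChangedAfter helper are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class AnimalType
{
    public string Type { get; set; } = "";
    public DateTime DateChanged { get; set; }
}

public static class AnimalQueries
{
    // Filters with Where, then sorts with OrderBy; the source list is left unchanged.
    public static List<AnimalType> ChangedAfter(IEnumerable<AnimalType> animalTypes, DateTime dateIn)
    {
        return animalTypes
            .Where(t => t.DateChanged > dateIn)
            .OrderBy(t => t.Type)
            .ToList();
    }

    public static void Main()
    {
        var animalTypes = new List<AnimalType>
        {
            new AnimalType { Type = "Zebra", DateChanged = new DateTime(2024, 3, 1) },
            new AnimalType { Type = "Cat",   DateChanged = new DateTime(2024, 2, 1) },
            new AnimalType { Type = "Dog",   DateChanged = new DateTime(2023, 12, 1) },
        };
        var dateIn = new DateTime(2024, 1, 1);

        // Select maps every element to a bool; nothing is filtered out.
        List<bool> flags = animalTypes.Select(t => t.DateChanged > dateIn).ToList();
        Console.WriteLine($"Select produced {flags.Count} booleans");

        // Where keeps only the matching elements; OrderBy sorts them by Type.
        List<AnimalType> results = ChangedAfter(animalTypes, dateIn);
        Console.WriteLine(string.Join(", ", results.Select(t => t.Type)));
    }
}
```

Because Select returns one bool per element, the list it produces has no Type property to sort on, which is consistent with the error reported for the chained Sort call.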

How can I get the IQueryable object used by LinqDataSource?

Is there a way to get the IQueryable object that the LinqDataSource has used to retrieve data? I thought that it might be possible from the selected event, but it doesn't appear to be.
Each row in my table has a category field, and I want to determine how many rows there are per category in the results.
I should also note that I'm using a DataPager, so not all of the rows are being returned. That's why I want to get the IQueryable, so that I can do something like
int count = query.Where(i => i.Category == "Category1").Count();
Use the QueryCreated event. QueryCreatedEventArgs has a Query property that contains the IQueryable.
The event is raised after the original LINQ query is created, and contains the query expression before it is sent to the database, without the ordering and paging parameters.
There's no "Selected" event in IQueryable. Furthermore, if you're filtering your data on the server, there'd be no way you can access it, even if the API exposed it, but to answer a part of the question, let's say you have category -> product where each category has many products and you want the count of the products in each category. It'd be a simple LINQ query:
var query = GetListOfCategories();
var categoryCount = query.Select(c => c.Products.Count());
Again, depending on the type of object GetListOfCategories returns, you might end up with the correct value for all the entries, or just for the ones that are loaded and in memory, but that's the difference between LINQ to Objects (in memory) and LINQ to other data sources (lazy loaded).
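For the original goal, the number of rows per category, grouping on the category field is one way to get all counts in a single query. The sketch below assumes a Row class with a Category property and runs against an in-memory AsQueryable() source; in the page itself the IQueryable would come from the Query property mentioned above. Row, CategoryCounts and CountByCategory are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Row
{
    public string Category { get; set; } = "";
}

public static class CategoryCounts
{
    // Groups the rows by category and counts the rows in each group.
    public static Dictionary<string, int> CountByCategory(IQueryable<Row> query)
    {
        return query
            .GroupBy(r => r.Category)
            .Select(g => new { Category = g.Key, Count = g.Count() })
            .ToDictionary(x => x.Category, x => x.Count);
    }

    public static void Main()
    {
        IQueryable<Row> rows = new List<Row>
        {
            new Row { Category = "Category1" },
            new Row { Category = "Category2" },
            new Row { Category = "Category1" },
        }.AsQueryable();

        foreach (KeyValuePair<string, int> pair in CountByCategory(rows))
        {
            Console.WriteLine($"{pair.Key}: {pair.Value}");
        }
    }
}
```

Compared with one Where(...).Count() call per category, this needs only one pass over the data and does not require knowing the category names in advance.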

Resources