We have a query for about 40 data fields related to customers. The query will often return a large amount of records, say up to 20,000. We only want to use say around the first 500 results. Then, we just want to be able to page through them 10 at a time.
Is LINQ skip and take a reasonable approach for this? Are there any potentnialy performance issues with using this approach vs doing it manually some other way?
Take() without Skip() generates SQL using TOP clause.
Take() with Skip generates SQL using ROW_NUMBER() as shown here.
Also, I'd recommend to use excellent LINQPad tool or LINQ to SQL logging to check generated queries.
Yes, if you're on SQL Server 2005+ it'll generate SQL that uses the ROW_NUMBER() function to make paging efficient (not unlike the SQL that Scott uses in this blog post).
Related
We must create and show at runtime (asp.net mvc) some complex reports from Oracle tables data with millions of records. The reports data must be obtained from groupings and little complex calculations.
So is it better for performance and maintainability of code that do these groupings and calculations via sql query (pl/sql) or via linq?
Thanks for your kindle reply
So is it better for performance and maintainability of code that do
these groupings and calculations via sql query (pl/sql) or via linq?
It depends on what you mean by via linq. If you mean that you fetch the complete table to local memory and then use linq statements to extract the result that you want, then of course SQL statements are faster.
However, if you mean that you use Entity Framework, or something similar, then the answer is not a easy to give.
If you use Entity Framework (or some clone), your tables will be represented by IQueryable<...> instead of IEnumerable<...>. An IQueryable has an Expression and a Provider. The Expression represents the query that must be performed. The Provider knows which system must execute the query (usually a Database Management System) and how to communicate with this system. When the query must be executed, it is the task of the Provider to translate the Expression into the language that the system knows (usually something SQL-like) and to execute the SQL-query.
There are two kinds of IQueryable LINQ statements: those that return an IQueryable<...> of something, and those that return a TResult. The ones that return IQueryable only change the Expression. They are functions that use deferred execution.
Function that do not return an IQueryable, are ToList(), FirstOrDefault(), Any(), Max(), etc. Internally they will call functions that will GetEnumerator() (usually via a foreach), which orders the Provider to translate the Expression and execute the query.
Back to your question
So which one is more efficient, entity framework or SQL? Efficiency is not only the time to perform the queries, it is also the development/testing time, for the first version and for future changes in the software.
If you use an entity-framework (-clone), the SQL-queries created from the Expressions are pretty efficient, depending on the framework manufacturer. If you look at the code, then sometimes the SQL query is not the optimal one, although you'll have to be a pretty good SQL-programmer to improve most queries.
The big advantage above using Entity Framework and LINQ queries above SQL statements is that development times will be shorter. The syntax of the LINQ statements is checked at compile time, SQL statements at run-time. Development and test periods will be shorter.
It is easy to reuse LINQ statements, while SQL statements almost always have to be written especially for the query you want to execute. LINQ statements can be tested without a database on any sequence of items that represent your tables.
My Advice
For most queries you won't notice any difference in execution time between the entity framework query or the SQL query.
If you expect complicated queries and future changes, I'd go for entity framework. With main argument the shorter development time, the better testing possibilities, and the better maintainability.
If you detect some queries where you notice that the execution time is too long, you can always decide to bypass entity framework by executing a SQL query instead of using LINQ.
If you've wrapped your DbContext in a proper repository, where you hide the use cases from their implementations, the users of your repository won't notice the difference.
I have a database(sql server 2005),now there are about 100000 records in the table called users, when I do query use linq to sql, it is very slower and slower.how can I do some operate to improve the speed?
Analyse your query and add some indexes to your table may help.
To get a more specific answer post more specific information (table stucture, indexes you have, the sql code L2S generates, ...)
You could (in order of preference)
Save your query as a stored procedure
Add indexes to your users
table, for what you are querying for/sorting for
Analyze your query
(if it is complicated), see if there's a less-resource-intensive way
of doing it. There are graphical query analyzers to help you.
As a last resort, not use LINQ, but instead ADO.NET Entity Framework, it's significantly faster. But you'll only see performance improvements for crazy stuff, and only if you've already done all of the above.
Use stored procedures and then use linq to sql to get the desired rows, this will give performance.
The best tools at your disposal for analyzing your database access and seeing what needs to be optimized are:
SQL Server Profiler
Graphical Execution Plans
The first one will allow you to see the exact queries being sent to your database from your application, which is especially useful if it turns out that your application is chattier than you think. The second one will allow you to take those queries and see exactly what the SQL server is doing with them.
In the graphical execution plan, look for steps which use a lot of CPU and paths which transfer a lot of records. Those are what you'll want to optimize. It's possible that you're doing a table scan somewhere, which is slow, or maybe joining on many more records than you need somewhere, which is slow, etc.
I have a table:
-- Tag
ID | Name
-----------
1 | c#
2 | linq
3 | entity-framework
I have a class that will have the following methods:
IEnumerable<Tag> GetAll();
IEnumerable<Tag> GetByName();
Should I use a compiled query in this case?
static readonly Func<Entities, IEnumerable<Tag>> AllTags =
CompiledQuery.Compile<Entities, IEnumerable<Tag>>
(
e => e.Tags
);
Then my GetByName method would be:
IEnumerable<Tag> GetByName(string name)
{
using (var db = new Entities())
{
return AllTags(db).Where(t => t.Name.Contains(name)).ToList();
}
}
Which generates a SELECT ID, Name FROM Tag and execute Where on the code. Or should I avoid CompiledQuery in this case?
Basically I want to know when I should use compiled queries. Also, on a website they are compiled only once for the entire application?
You should use a CompiledQuery when all of the following are true:
The query will be executed more than once, varying only by parameter values.
The query is complex enough that the cost of expression evaluation and view generation is "significant" (trial and error)
You are not using a LINQ feature like IEnumerable<T>.Contains() which won't work with CompiledQuery.
You have already simplified the query, which gives a bigger performance benefit, when possible.
You do not intend to further compose the query results (e.g., restrict or project), which has the effect of "decompiling" it.
CompiledQuery does its work the first time a query is executed. It gives no benefit for the first execution. Like any performance tuning, generally avoid it until you're sure you're fixing an actual performance hotspot.
2012 Update: EF 5 will do this automatically (see "Entity Framework 5: Controlling automatic query compilation") . So add "You're not using EF 5" to the above list.
Compiled queries save you time, which would be spent generating expression trees. If the query is used often and you'll save the compiled query, you should definitely use it. I had many cases when the query parsing took more time than the actual round trip to the database.
In your case, if you are sure that it would generate SELECT ID, Name FROM Tag without the WHERE case (which I doubt, as your AllQueries function should return IQueryable and the actual query should be made only after calling ToList) - you shouldn't use it.
As someone already mentioned, on bigger tables SELECT * FROM [someBigTable] would take very long and you'll spend even more time filtering that on the client side. So you should make sure that your filtering is made on the database side, no matter if you are using compiled queries or not.
compiled queries are more helpfull with linq queries with large expression trees say complex queries to gain performance over building expression tree again and again while reusing query. in your case i guess it will save a very little time.
Compiled queries are compiled when the application is compiled and every time you reuse a query often or it is complex you should definitely try compiled queries to make execution faster.
But I would not go for it on all queries as it is a little more code to write and for simple queries it might not be worthwhile.
But for maximum performance you should also evaluate Stored Procedures where you do all the processing on the database server, even if Linq tries to push as much of the work to the db as possible you will have situations where a stored procedure will be faster.
Compiled queries offer a performance improvement, but it's not huge. If you have complex queries, I'd rather go with a stored procedure or a view, if possible; letting the database do it's thing might be a better approach.
Recently I started to learning Entity Framework.
First question made in my mind is:
When we want to use LINQ to fetching data in EF, every query like this:
var a = from p in contacts select p.name ;
will be converts to SQL commands like this :
select name from contacts
does this converting repeat every time that we are querying?
I heard that stored procedures are cached in database, does this event happens in LINQ queries in Entity Framework ?
And at last is my question clear?
I think linq query is converted each time you want to execute it. To improve performance you can use compiled queries.
There are all sorts of optimizations being made, both in the linq expression caching and what SQL server chooses to cache, the only way is to measure your performance speed and memory consumption
To see what SQL is created you can use http://efprof.com/ which I've found quite good. You can get some of this info through SQL profiler, it's just a lot more work.
On a site with a high number of users, should paging be handled in code, or with a stored procedure. If you have employed caching, please include your success factors.
Personally, I never page stuff outside SQL Server. I do this at database level as if you have a million records to be paged, if you retrieve it in application layer and page it there, you are already paying a huge cost.
99.9% of the time, paging should be done on your database server. However, stored procedures are not required to do this, and, in fact, many stored procedure solutions rely on cursors and are quite inefficient. Ideally, use a single SQL statement tailored to your database platform to retrieve just the records you need and no more.
I would do it at database level. Talking about sql server 2005, i would use the new ROW_NUMBER() function, look at:
Paging SQL Server 2005 Results
Where a typical sql would be:
SELECT Row_Number() OVER(ORDER BY UserName) As RowID, UserFirstName, UserLastName
FROM Users
WHERE RowID Between 0 AND 9
Here https://web.archive.org/web/20210510021915/http://aspnet.4guysfromrolla.com/articles/031506-1.aspx you can see how it works and examine a little benchmark by Scott Mitchell.
Most database vendors offer rich paging support at the database. Make use of it ;-p Note that it doesn't have to be a stored-procedure to do this (I'll sideline the ever-running stored-proc vs ad-hoc command debate).
As an aside, many frameworks will also do this for you efficiently. For example, in .NET 3.5 (with LINQ), you can use Skip() and Take() to do paging that is used at the db.
I think It depends on number of records to be paged. For example you have 100 records to be paged I think there is no need SQL paging stuff to do this. I am always trying to keep in my mind KISS principle and Premature optimization.