LINQ performance improvements - visual-studio-2010

I'm starting to use LINQ, and I'm very happy with its intuitive use, but I'm the kind of person who thinks performance can always be improved.
What's the best way of self analyzing LINQ? Is there a tool within Visual Studio 2010 that will help me analyze my LINQ? Any type of query analyzer, something that will give me accurate numbers so I can see if I'm really improving times.
What methods do you use to improve LINQ performance?

LINQPad is your friend.
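For rough numbers outside LINQPad, a Stopwatch around the query is often enough to tell whether a change helped. The following is only a minimal sketch: the workload, the BestOfMs helper and the run count are hypothetical, and it is written as a top-level program (C# 9 or later), which postdates Visual Studio 2010.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical workload: sum the squares of the even numbers below one million.
int[] data = Enumerable.Range(0, 1_000_000).ToArray();

// Runs the action several times and returns the best elapsed milliseconds.
// One untimed warm-up call keeps JIT compilation out of the measurement.
double BestOfMs(Action action, int runs = 5)
{
    action(); // warm-up, not timed
    double best = double.MaxValue;
    for (int i = 0; i < runs; i++)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        best = Math.Min(best, sw.Elapsed.TotalMilliseconds);
    }
    return best;
}

long linqResult = 0, loopResult = 0;

// LINQ version of the workload.
double linqMs = BestOfMs(() =>
    linqResult = data.Where(x => x % 2 == 0).Sum(x => (long)x * x));

// Hand-written loop doing the same work, for comparison.
double loopMs = BestOfMs(() =>
{
    long total = 0;
    foreach (int x in data)
        if (x % 2 == 0) total += (long)x * x;
    loopResult = total;
});

Console.WriteLine($"LINQ: {linqMs:F2} ms, loop: {loopMs:F2} ms, same result: {linqResult == loopResult}");
```

The timings vary by machine and build configuration, so compare runs of a Release build on the same machine rather than trusting any single figure.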

Related

Entity Framework with a very large and complex dataset

I would appreciate it very much if you can help me with my questions:
Is EF5 reliable and efficient enough to deal with very large and complex datasets in the real world?
Compared with ADO.NET, does EF5 require significantly more resources, such as memory?
For those who have tried EF5 on a real-world project with a very large and complex dataset, are you happy with the performance so far?
Because EF creates an abstraction over data access, using it introduces a number of additional steps to execute any query, but there are workarounds to reduce the cost. As MS is promoting this ORM, I believe they are doing a lot to improve performance as well. The EF 6 beta is already out.
There is a good article on EF5 performance available on MSDN:
http://msdn.microsoft.com/en-us/data/hh949853
I would not use EF if a lot of complex manipulation and iteration is required on the DBSets after populating them from the EF query.
Hope this helps.
EF is more than capable of handling large amounts of data. Just as in plain ADO.NET, you need to understand how to use it correctly. It's just as easy to write code in ADO.NET that performs poorly. It's also important to remember that EF is built on top of ADO.NET.
DBSets will be much slower with large amounts of data than a code-first EF approach. Plain DataReaders could be faster if done correctly.
The correct answer is 'profile'. Code some large objects and profile the differences.
I did some research into this on EF 4.1. Some details might still apply, though keep in mind that EF5 brought performance improvements.
ADO vs EF performance
My conclusions:
-You won't match ADO performance with a framework that has to generate the actual SQL statement dynamically from C# and turn the result of that SQL statement into a strongly typed object. There is simply too much going on, even with precompiled and 'warmed' statements (and performance tests confirm this). This is a happy trade-off for many who find it much easier to write and maintain LINQ in C# than stored procs and mapping code.
-You can still use stored procedures with performance equal to ADO, which to me is the perfect situation. You can write LINQ in situations where it works and, as soon as you need more performance, write the sproc and import it into EF to enjoy the best performance possible.
-There is no technical reason to avoid EF except for the requirements of your project and possibly your team's knowledge and comfort level. Make sure it fits into your selected design pattern. See: EF Anti Patterns to avoid
I hope this helps.

Will usage of LINQ increase day by day or is it that some organizations do not like to use it?

Linq allows you to simplify your code which is always good, as it makes code less fragile and easier to maintain - as long as your intent (as the developer) is clear.
In my experience projects are only light on Linq usage if the development team don't understand the technology fully, or feel that it doesn't fit into their naive views on proper 'OO' architectures and patterns.
This is highly subjective and depends on context, but I would say absolutely. If you've built a medium-sized application both with and without an ORM, you will quickly understand the immense benefits LINQ affords. It's hard to imagine an organization building subsequent applications without an ORM in conjunction with LINQ.

Is LINQ to Everything a good abstraction?

There is a proliferation of new LINQ providers. It is really quite astonishing and an elegant combination of lambda expressions, anonymous types and generics with some syntax sugar on top to make it easy reading. Everything is LINQed now from SQL to web services like Amazon to streaming sensor data to parallel processing. It seems like someone is creating an IQueryable<T> for everything but these data sources can have radically different performance, latency, availability and reliability characteristics.
It gives me a little pause that LINQ makes those performance details transparent to the developer. Is LINQ a solid general purpose abstraction or a RAD tool or both?
To me, LINQ is just a way to make code more readable, and hence more maintainable. LINQ does nothing more than take standard methods and integrate them into the language (hence the name: language integrated query).
It's nothing but a syntax element around normal interfaces and methods - there is no "magic" here, and LINQ-to-something really should (IMO) be treated as any other 3rd party API - you need to understand the cost/benefits of using it just like any other technology.
That being said, it's a very nice syntax helper. It does a lot for making code cleaner, simpler, and more maintainable, and I believe that's where its true strengths lie.
I see this as similar in its design to the model of multiple storage engines in an RDBMS accepting a common(-ish) language of SQL, but with the added benefit of integration into the application language semantics. Of course it is good!
I have not used it that much, but it looks sensible and clear when performance and layers of abstraction are not in a position to have a negative impact on the development process (and you trust that standards and models won't change wildly).
It is just an interface and an implementation that may or may not fit your needs. As with all interfaces, abstractions, libraries and implementations, the question is: does it fit? The answer is always the same.
I suppose - no.
LINQ is just a convenient syntax, not a general-purpose RAD tool. In big projects with complex logic I noticed that developers make more errors in LINQ than they would if they wrote the same thing in the .NET 2.0 manner. The code is produced faster and it is smaller, but it is harder to find bugs. Sometimes it is not obvious at first glance at what point the queried collection turns from IQueryable into IEnumerable... I would say that LINQ requires more skilled and disciplined developers.
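The IQueryable-to-IEnumerable switch mentioned above can be illustrated with a small sketch. It uses an in-memory array wrapped by AsQueryable() as a stand-in for a real provider (an assumption: with LINQ to SQL or Entities the expression tree would be translated to SQL), and it is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory stand-in for a real provider (assumption, see the text above).
IQueryable<int> source = new[] { 1, 2, 3, 4, 5, 6 }.AsQueryable();

// Still IQueryable<int>: the filter is recorded in the expression tree,
// so a database provider could execute it on the server.
IQueryable<int> composed = source.Where(x => x > 2);

// AsEnumerable() changes the static type to IEnumerable<int>. Operators
// after this point bind to Enumerable.* and run in memory on the client.
IEnumerable<int> inMemory = source.AsEnumerable().Where(x => x > 2);

bool filterInTree = composed.Expression.ToString().Contains("Where");
bool stillQueryable = inMemory is IQueryable<int>;

Console.WriteLine(composed.Expression);
Console.WriteLine($"filter in expression tree: {filterInTree}");
Console.WriteLine($"second query still IQueryable: {stillQueryable}");
```

Both queries return the same elements here; the difference is where the filter runs, which against a database decides how many rows cross the wire.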
Also, SQL-like syntax is OK for functional programming, but it is a sidestep from object-oriented thinking. Sometimes two very similar LINQ queries look like copy-paste code, but refactoring them is not always possible (or is possible only by sacrificing some performance).
I heard that MS is not going to develop LINQ to SQL further and will give more priority to Entities. Is the ADO.NET Team Abandoning LINQ to SQL? Isn't this a signal that LINQ is not a panacea for everybody?
If you are thinking about building a connector to "something", you can build it without LINQ and, if you like, provide LINQ as an additional optional wrapper around it, like LINQ to Entities. Your customers can then decide whether to use LINQ or not, depending on their needs, required performance, etc.
p.s.
.NET 4.0 will come with dynamics, and I expect that everybody will start to use them the way they use LINQ... without taking into consideration that code simplicity, quality and performance may suffer.

Examples on when not to use LINQ

This is sort of a follow up on this question I read:
What is the biggest mistake people make when starting to use LINQ?
The top answer is "That it should be used for everything." That made me wonder what exactly that means.
What are some common examples where someone used LINQ when they should not have?
You shouldn't use LINQ when the alternative is simpler or significantly more efficient.
I would suggest avoiding LINQ any time it makes the code less obvious.
However, in general, I think LINQ makes things easier to follow, not more difficult, so I rarely avoid it.
It is possible for LINQ to be significantly slower than the alternatives, especially if you create a lot of intermediate Lists. However, that only matters for pretty large datasets, so large that I haven't encountered them.
However, one thing to keep in mind is that a well-written LINQ query can also be much faster than the alternative because of the way IEnumerable works.
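The effect of deferred execution can be shown with a small sketch. The Expensive function and the call counter are hypothetical, and the code is written as a top-level program (C# 9 or later). It compares a query that materializes an intermediate List with one that streams elements lazily.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int evaluated = 0;

// Hypothetical "expensive" projection that counts how often it is called.
int Expensive(int x)
{
    evaluated++;
    return x * x;
}

IEnumerable<int> numbers = Enumerable.Range(1, 1000);

// Eager: ToList() materializes all 1000 projected values before First runs.
evaluated = 0;
int eagerResult = numbers.Select(x => Expensive(x)).ToList().First(v => v > 50);
int eagerCalls = evaluated;

// Lazy: elements stream one at a time and First stops at the first match.
evaluated = 0;
int lazyResult = numbers.Select(x => Expensive(x)).First(v => v > 50);
int lazyCalls = evaluated;

Console.WriteLine($"eager: result {eagerResult}, {eagerCalls} calls"); // eager: result 64, 1000 calls
Console.WriteLine($"lazy:  result {lazyResult}, {lazyCalls} calls");   // lazy:  result 64, 8 calls
```

Both versions return the same value, but the lazy one does only as much work as the consumer asks for.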
Finally, using LINQ now will allow you to switch to Parallel LINQ when it is released with little or no changes.
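Parallel LINQ has since shipped with .NET 4, and the switch is typically a single AsParallel() call. The sketch below uses a hypothetical CPU-bound predicate (IsPrime) and is written as a top-level program (C# 9 or later).

```csharp
using System;
using System.Linq;

int[] data = Enumerable.Range(1, 100_000).ToArray();

// Hypothetical CPU-bound predicate: trial-division primality test.
bool IsPrime(int n)
{
    if (n < 2) return false;
    for (int d = 2; (long)d * d <= n; d++)
        if (n % d == 0) return false;
    return true;
}

// Ordinary LINQ to Objects query.
int sequentialCount = data.Where(n => IsPrime(n)).Count();

// The only change is AsParallel(); the remaining operators bind to
// ParallelEnumerable and may run on multiple cores.
int parallelCount = data.AsParallel().Where(n => IsPrime(n)).Count();

Console.WriteLine($"sequential: {sequentialCount}, parallel: {parallelCount}");
```

Count does not depend on element order, so no further changes are needed here. Queries whose results depend on ordering would also need AsOrdered(), and whether the parallel version is actually faster depends on the workload and the machine.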
It's still okay to use foreach. :)

LINQ2SQL performance vs. custom DAL vs. NHibernate

Given a straightforward user-driven, high traffic web application (no fancy reporting/BI):
If my utmost goal is performance (not ease of maintainability, ease of queryability, etc.), I would surmise that in most cases a roll-your-own DAL would be the best choice.
However, if I were to choose Linq2SQL or NHibernate, roughly what kind of performance hit would we be talking about? 10%? 20%? 200%? Of the two, which would be faster?
Does anyone have any real world numbers that could shed some light on this? (and yes, I know Stackoverflow runs on Linq2SQL..)
If you know your stuff (esp. in SQL and ADO.NET), then yes - most likely, you'll be able to create a highly tweaked, highly optimized custom DAL for your particular interest and be faster overall than a general-purpose ORM like Linq-to-SQL or NHibernate.
As to how much, that's really hard to say without knowing your concrete table structure, data and usage patterns. I remember Rico Mariani did some Linq-to-SQL vs. raw SQL comparisons, and his end result was that Linq-to-SQL achieved over 90% of the performance of a highly skilled SQL programmer.
See: http://blogs.msdn.com/ricom/archive/2007/07/05/dlinq-linq-to-sql-performance-part-4.aspx
Not too shabby in my book, especially if you factor in the productivity gains you get - but that's the big trade-off always: productivity vs. raw performance.
Here's another blog post on Entity Framework and Linq-to-SQL compared to DataReader and DataTable performance.
I don't have any such numbers for NHibernate, unfortunately.
In two high-traffic web apps, refactoring an ORM call to use a stored procedure from ADO.NET only got us about a 1-2% change in CPU and time.
Going from an ORM to a custom DAL is an exercise in micro optimization.
