I took a look at the "Beginner's Guide to LINQ" post here on StackOverflow, but had a follow-up question:
We're about to ramp up a new project where nearly all of our database operations will be fairly simple data retrievals (there's another segment of the project which already writes the data). Most of our other projects up to this point make use of stored procedures for such things. However, I'd like to leverage LINQ-to-SQL if it makes more sense.
So, the question is this: For simple data retrievals, which approach is better, LINQ-to-SQL or stored procs? Any specific pros or cons?
Thanks.
Some advantages of LINQ over sprocs:
1. Type safety: I think we all understand this.
2. Abstraction: This is especially true with LINQ-to-Entities. This abstraction also allows the framework to add improvements that you can easily take advantage of. PLINQ is an example of adding multi-threading support to LINQ. Code changes are minimal to add this support. It would be MUCH harder to do this with data access code that simply calls sprocs.
3. Debugging support: I can use any .NET debugger to debug the queries. With sprocs, you cannot easily debug the SQL and that experience is largely tied to your database vendor (MS SQL Server provides a query analyzer, but often that isn't enough).
4. Vendor agnostic: LINQ works with lots of databases and the number of supported databases will only increase. Sprocs are not always portable between databases, either because of varying syntax or feature support (if the database supports sprocs at all).
5. Deployment: Others have mentioned this already, but it's easier to deploy a single assembly than to deploy a set of sprocs. This also ties in with #4.
6. Easier: You don't have to learn T-SQL to do data access, nor do you have to learn the data access API (e.g. ADO.NET) necessary for calling the sprocs. This is related to #3 and #4.
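To make the PLINQ remark concrete, here is a minimal sketch. It uses LINQ to Objects rather than LINQ-to-SQL, since PLINQ parallelizes in-memory queries, and both the number range and the IsPrime helper are made up for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical CPU-bound predicate, used only for illustration.
static bool IsPrime(int n)
{
    if (n < 2) return false;
    for (int i = 2; (long)i * i <= n; i++)
        if (n % i == 0) return false;
    return true;
}

var numbers = Enumerable.Range(1, 100_000);

// Sequential LINQ to Objects query.
int sequentialCount = numbers.Where(IsPrime).Count();

// The same query with PLINQ: the only change is the AsParallel() call.
int parallelCount = numbers.AsParallel().Where(IsPrime).Count();

Console.WriteLine($"sequential: {sequentialCount}, parallel: {parallelCount}");
```

The query itself is unchanged between the two versions, which is the "minimal code changes" point; whether the parallel version is actually faster depends on the workload.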
Some disadvantages of LINQ vs sprocs:
Network traffic: sprocs need only serialize sproc-name and argument data over the wire while LINQ sends the entire query. This can get really bad if the queries are very complex. However, LINQ's abstraction allows Microsoft to improve this over time.
Less flexible: Sprocs can take full advantage of a database's featureset. LINQ tends to be more generic in its support. This is common in any kind of language abstraction (e.g. C# vs assembler).
Recompiling: If you need to make changes to the way you do data access, you need to recompile, version, and redeploy your assembly. Sprocs can sometimes allow a DBA to tune the data access routine without a need to redeploy anything.
Security and manageability are something that people argue about too.
Security: For example, you can protect your sensitive data by restricting access to the tables directly, and put ACLs on the sprocs. With LINQ, however, you can still restrict direct access to tables and instead put ACLs on updatable table views to achieve a similar end (assuming your database supports updatable views).
Manageability: Using views also gives you the advantage of shielding your application from non-breaking schema changes (like table normalization). You can update the view without requiring your data access code to change.
I used to be a big sproc guy, but I'm starting to lean towards LINQ as a better alternative in general. If there are some areas where sprocs are clearly better, then I'll probably still write a sproc but access it using LINQ. :)
I am generally a proponent of putting everything in stored procedures, for all of the reasons DBAs have been harping on for years. In the case of Linq, it is true that there will be no performance difference with simple CRUD queries.
But keep a few things in mind when making this decision: using any ORM couples you tightly to your data model. A DBA has no freedom to make changes to the data model without forcing you to change your compiled code. With stored procedures, you can hide these sorts of changes to an extent, since the parameter list and results set(s) returned from a procedure represent its contract, and the innards can be changed around, just so long as that contract is still met.
Also, if Linq is used for more complex queries, tuning the database becomes a much more difficult task. When a stored procedure is running slowly, the DBA can focus entirely on the code in isolation, and has lots of options, just so long as that contract is still satisfied when he/she is done.
I have seen many, many cases where serious problems in an application were addressed by changes to the schema and code in stored procedures without any change to deployed, compiled code.
Perhaps a hybrid approach would be nice with Linq? Linq can, of course, be used to call stored procedures.
Linq to Sql.
SQL Server will cache the query plans, so there's no performance gain for sprocs.
Your linq statements, on the other hand, will be logically part of and tested with your application. Sprocs are always a bit separated and are harder to maintain and test.
If I was working on a new application from scratch right now I would just use Linq, no sprocs.
For basic data retrieval I would be going for Linq without hesitation.
Since moving to Linq I've found the following advantages:
Debugging my DAL has never been easier.
Compile time safety when your schema changes is priceless.
Deployment is easier because everything is compiled into DLLs. No more managing deployment scripts.
Because Linq can query anything that implements the IEnumerable<T> or IQueryable<T> interface, you will be able to use the same syntax to query XML, objects and any other data source without having to learn a new syntax.
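As a small illustration of the "same syntax" point, the sketch below runs one query shape over in-memory objects and over XML. The sample data and names are invented for the example; nothing here touches a database.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// In-memory objects (made-up sample data).
var people = new[]
{
    new { Name = "Ann", Age = 31 },
    new { Name = "Bob", Age = 17 },
    new { Name = "Cid", Age = 45 },
};

var adultsFromObjects =
    (from p in people
     where p.Age >= 18
     orderby p.Name
     select p.Name).ToList();

// The same data as XML, queried with the same query syntax.
var doc = XDocument.Parse(
    "<people>" +
    "<person name='Ann' age='31'/>" +
    "<person name='Bob' age='17'/>" +
    "<person name='Cid' age='45'/>" +
    "</people>");

var adultsFromXml =
    (from e in doc.Root.Elements("person")
     where (int)e.Attribute("age") >= 18
     orderby (string)e.Attribute("name")
     select (string)e.Attribute("name")).ToList();

Console.WriteLine(string.Join(", ", adultsFromObjects)); // Ann, Cid
Console.WriteLine(string.Join(", ", adultsFromXml));     // Ann, Cid
```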
LINQ will bloat the procedure cache
If an application is using LINQ to SQL and the queries involve the use of strings that can be highly variable in length, the SQL Server procedure cache will become bloated with one version of the query for every possible string length. For example, consider the following very simple queries created against the Person.AddressTypes table in the AdventureWorks2008 database:
    var p =
        from n in x.AddressTypes
        where n.Name == "Billing"
        select n;

    var p =
        from n in x.AddressTypes
        where n.Name == "Main Office"
        select n;
If both of these queries are run, we will see two entries in the SQL Server procedure cache: One bound with an NVARCHAR(7), and the other with an NVARCHAR(11). Now imagine if there were hundreds or thousands of different input strings, all with different lengths. The procedure cache would become unnecessarily filled with all sorts of different plans for the exact same query.
More here: https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=363290
I think the pro LINQ argument seems to be coming from people who don't have a history with database development (in general).
Especially if using a product like VS DB Pro or Team Suite, many of the arguments made here do not apply, for instance:
Harder to maintain and Test:
VS provides full syntax checking, style checking, referential and constraint checking and more. It also provides full unit testing capabilities and refactoring tools.
LINQ makes true unit testing impossible as (in my mind) it fails the ACID test.
Debugging is easier in LINQ:
Why? VS allows full step-in from managed code and regular debugging of SPs.
Compiled into a single DLL rather than deployment scripts:
Once again, VS comes to the rescue where it can build and deploy full databases or make data-safe incremental changes.
Don't have to learn TSQL with LINQ:
No you don't, but you have to learn LINQ - where's the benefit?
I really don't see this as being a benefit. Being able to change something in isolation might sound good in theory, but just because the changes fulfil a contract doesn't mean it's returning the correct results. To be able to determine what the correct results are you need context and you get that context from the calling code.
Um, loosely coupled apps are the ultimate goal of all good programmers as they really do increase flexibility. Being able to change things in isolation is fantastic, and it is your unit tests that will ensure it is still returning appropriate results.
Before you all get upset, I think LINQ has its place and has a grand future. But for complex, data-intensive applications I do not think it is ready to take the place of stored procedures. This was a view I had echoed by an MVP at TechEd this year (they will remain nameless).
EDIT: The LINQ to SQL Stored Procedure side of things is something I still need to read more on - depending on what I find I may alter my above diatribe ;)
LINQ is new and has its place. LINQ was not invented to replace stored procedures.
Here I will focus on some performance myths and cons, just for "LINQ to SQL". Of course, I might be totally wrong ;-)
(1) People say LINQ statements can be "cached" in SQL Server, so they don't lose performance. Partially true. "LINQ to SQL" is actually the runtime translating LINQ syntax into a TSQL statement. So from a performance perspective, a hard-coded ADO.NET SQL statement is no different from LINQ.
(2) As an example, a customer service UI has an "account transfer" function. This function might update 10 DB tables and return some messages in one shot. With LINQ, you have to build a set of statements and send them as one batch to SQL Server. The performance of this translated LINQ->TSQL batch can hardly match a stored procedure. The reason? You can tweak the smallest unit of a statement in a stored procedure by using the built-in SQL Profiler and execution plan tools; you cannot do this in LINQ.
The point is, for CRUD on a single DB table and a small set of data, LINQ is as fast as an SP. But for much more complicated logic, a stored procedure is easier to tune for performance.
(3) "LINQ to SQL" makes it easy for newbies to introduce performance hogs. Any senior TSQL guy can tell you when not to use a CURSOR (basically, you should not use a CURSOR in TSQL in most cases). With LINQ and the charming "foreach" loop over a query, it's so easy for a newbie to write code like this:
    foreach (Customer c in query)
    {
        c.Country = "Wonder Land";
    }
    ctx.SubmitChanges();
You can see how attractive this easy, decent-looking code is. But under the hood, the .NET runtime just translates this into an update batch. If there are only 500 rows, this is a 500-line TSQL batch; if there are a million rows, this is a real hit. Of course, an experienced user won't do the job this way, but the point is that it's so easy to fall into this trap.
The best code is no code. With stored procedures you have to write at least some code in the database and code in the application to call it, whereas with LINQ to SQL or LINQ to Entities, you don't have to write any additional code beyond any other LINQ query, aside from instantiating a context object.
LINQ definitely has its place in application-specific databases and in small businesses.
But in a large enterprise, where central databases serve as a hub of common data for many applications, we need abstraction. We need to centrally manage security and show access histories. We need to be able to do impact analysis: if I make a small change to the data model to serve a new business need, what queries need to be changed and what applications need to be re-tested? Views and Stored Procedures give me that. If LINQ can do all that, and make our programmers more productive, I'll welcome it -- does anyone have experience using it in this kind of environment?
"A DBA has no freedom to make changes to the data model without forcing you to change your compiled code. With stored procedures, you can hide these sorts of changes to an extent, since the parameter list and results set(s) returned from a procedure represent its contract, and the innards can be changed around, just so long as that contract is still met."
I really don't see this as being a benefit. Being able to change something in isolation might sound good in theory, but just because the changes fulfil a contract doesn't mean it's returning the correct results. To be able to determine what the correct results are you need context and you get that context from the calling code.
I think you need to go with procs for anything real.
A) Writing all your logic in linq means your database is less useful because only your application can consume it.
B) I'm not convinced that object modelling is better than relational modelling anyway.
C) Testing and developing a stored procedure in SQL is a hell of a lot faster than a compile edit cycle in any Visual Studio environment. You just edit, F5 and hit select and you are off to the races.
D) It's easier to manage and deploy stored procedures than assemblies. You just put the file on the server and press F5.
E) Linq to sql still writes crappy code at times when you don't expect it.
Honestly, I think the ultimate thing would be for MS to augment t-sql so that it can do a join projection implicitly the way linq does. t-sql should know if you wanted to do order.lineitems.part, for example.
LINQ doesn't prohibit the use of stored procedures. I've used mixed mode with LINQ-SQL and LINQ-storedproc. Personally, I'm glad I don't have to write the stored procs....pwet-tu.
IMHO, RAD = LINQ, RUP = Stored Procs. I worked for a large Fortune 500 company for many years, at many levels including management, and frankly, I would never hire RUP developers to do RAD development. They are so siloed that they have very limited knowledge of what to do at other levels of the process. With a siloed environment, it makes sense to give DBAs control over the data through very specific entry points, because others frankly don't know the best ways to accomplish data management.
But large enterprises move painfully slowly in the development arena, and this is extremely costly. There are times when you need to move faster to save both time and money, and LINQ provides that and more in spades.
Sometimes I think that DBAs are biased against LINQ because they feel it threatens their job security. But that's the nature of the beast, ladies and gentlemen.
Going by what the gurus say, I would describe LINQ as a motorcycle and SP as a car.
If you want to go on a short trip and only have a few passengers (in this case 2), go gracefully with LINQ.
But if you want to go on a long journey and have a large band, I think you should choose SP.
In conclusion, choosing between the motorcycle and the car depends on your route (business), length (time), and passengers (data).
Hope it helps, I may be wrong. :D
Also, there is the issue of possible 2.0 rollback. Trust me it has happened to me a couple of times so I am sure it has happened to others.
I also agree that abstraction is the best. Along with the fact, the original purpose of an ORM is to make RDBMS match up nicely to the OO concepts. However, if everything worked fine before LINQ by having to deviate a bit from OO concepts then screw 'em. Concepts and reality don't always fit well together. There is no room for militant zealots in IT.
I'm assuming you mean Linq To Sql
For any CRUD command it's easy to profile the performance of a stored procedure vs. any technology. In this case any difference between the two will be negligible. Try profiling for a 5 (simple types) field object over 100,000 select queries to find out if there's a real difference.
On the other hand the real deal-breaker will be the question on whether you feel comfortable putting your business logic on your database or not, which is an argument against stored procedures.
All these answers leaning towards LINQ are mainly talking about EASE of DEVELOPMENT, which is more or less connected to poor coding quality or laziness in coding. I am like that myself.
Some advantages of LINQ that I read here, such as being easy to test and easy to debug, are in no way connected to the final output or the end user. This is always going to cause trouble for the end user on performance. What's the point of loading many things into memory and then applying filters to them using LINQ?
Again, type safety is a caution that "we are careful to avoid wrong typecasting", which again is poor quality that we are trying to improve by using LINQ. Even in that case, if anything in the database changes, e.g. the size of a string column, then the LINQ code needs to be recompiled and would not be type-safe without that. I tried.
Although we found it good, sweet, interesting, etc. while working with LINQ, it has the sheer disadvantage of making developers lazy :) and it has been proved 1000 times that it is bad (maybe the worst) on performance compared to stored procs.
Stop being lazy. I am trying hard. :)
For simple CRUD operations with a single data access point, I would say go for LINQ if you feel comfortable with the syntax. For more complicated logic I think sprocs are more efficient performance-wise if you are good at T-SQL and its more advanced operations. You also have help from Tuning Advisor, SQL Server Profiler, debugging your queries from SSMS, etc.
The outcome can be summarized as
LinqToSql: for small sites and prototypes. It really saves time when prototyping.
SPs: universal. I can fine-tune my queries and always check ActualExecutionPlan / EstimatedExecutionPlan.
    CREATE PROCEDURE userInfoProcedure
        -- Parameters for the stored procedure
        -- (lengths are assumed; varchar with no length defaults to 1 character)
        @FirstName varchar(50),
        @LastName varchar(50)
    AS
    BEGIN
        SET NOCOUNT ON;
        -- Return the matching user rows
        SELECT FirstName, LastName, Age
        FROM UserInfo
        WHERE FirstName = @FirstName
          AND LastName = @LastName
    END
    GO
http://www.totaldotnet.com/Article/ShowArticle121_StoreProcBasic.aspx
Stored procedures make testing easier, and you can change the query without touching the application code. Also, with linq, getting data back does not mean it's the right data. Testing the correctness of the data means running the application, but with a stored procedure it's easy to test without touching the application.
Both LINQ and SQL have their places. Both have their disadvantages and advantages.
Sometimes for complex data retrieval you might need stored procs. And sometimes you may want other people to use your stored proc in Sql Server Management Studio.
Linq to Entities is great for fast CRUD development.
Sure, you can build an app using only one or the other. Or you can mix it up. It all comes down to your requirements. But SQL stored procs will not go away any time soon.
Related
I have recently been working with an Oracle database to generate some reports. What I need is to get result sets of specific records (SELECT statements only), sometimes large ones, to be used for generating the report in an Excel file.
At first, the reports were queried from views, but some of them are slow (they have some complex subqueries). I was asked to improve the performance and also fix some field mappings. I also want to tidy things up, because when I query against a view, I must specifically call the right column name. I want to move the data work into the database, so that the web app just passes parameters and calls the right result set.
I'm new to Oracle, so which is better for this kind of task: an SP or a function? Or under what conditions might a view be better?
Makes no difference whether you compile your SQL in a view, SP or function. It is the SQL itself that matters.
As long as you are able to meet your requirements with the views they should be a good option. If you intend to break-up your queries into multiple ones for achieving better performance then you should go for stored procedures. If you decide to go for stored procedure then it would be advisable to create a package and bundle all the stored procedures together in the package. If your problem is performance then there may not be a silver bullet solution for the same. You will have to work on your queries and design for the same.
If the problem is performance due to complex SELECT query (queries), you can consider tuning the queries. Often you will find queries written 15-20 years ago, which do not use functionality and techniques that were introduced by Oracle in more recent versions (even if the organization spent the big bucks to buy the more recent versions - making it into a waste of money). Honestly, that may be too much of a task for you if you are new at Oracle; also, some slow queries may have been written by people just like you, many years ago - before they had a chance to learn a lot about Oracle and have experience with it.
Another thing, if the reports don't need to use the absolute current state of the underlying tables (for example, if "what was in the tables at the end of the business day yesterday" is acceptable), you can create a materialized view. It will not work any faster than a regular view, but it can run overnight (say), or every six hours, or whatever - so that the further reporting processing from there will not have to wait for the queries to complete. This is one of the main uses of materialized views.
Good luck!
I'm writing an assignment for a databases class, and we're required to migrate our existing relational schema to Oracle objects. This whole debacle has got me wondering, just how widely used are these things? The data model is wonky, the syntax is horrendous, and the object orientation is only about three quarters of the way implemented.
Does anyone actually use this?
For starters some standard Oracle functionality uses Types, for instance XMLDB and Spatial (which includes declaring columns of Nested Table data types).
Also, many PL/SQL developers use types all the time, for declaring PL/SQL collections or pipelined functions.
But I agree few places use Types extensively and build PL/SQL APIs out of them. There are several reasons for this.
Oracle has implemented Objects very slowly. Although they were introduced in version 8.0 it wasn't until 9.2 that they fully supported Inheritance, Polymorphism and user-defined constructors. Proper Object-Oriented Programming is impossible without those features. We didn't get SUPER() until version 11g. Even now there are features missing, most notably the private declarations in the TYPE BODY.
The syntax is often clunky or frustratingly obscure. The documentation doesn't help.
Most people working with Oracle tend to come from the relational/procedural school of programming. This means they tend not to understand OOP, or they fail to understand where it can be useful in database programming. Even when people do come up with a neat idea they find it hard or impossible to implement using Oracle's syntax.
That last point is the key one. We can learn new syntax, we can persuade Oracle to complete the feature set, but it is only worthwhile if we can come up with a use for Types. That means we need problems which can be solved using Inheritance and Polymorphism.
I have worked on one system which used types extensively. It was a data warehouse system, and the data loading sub-system was built out of Types. The underlying rationale was simple:
we need to apply the same business rule template for every table we load, so the process is generic;
every table has its own projection, so the SQL statements are unique for each one.
The Type implementation is clean: the generic process is defined in a Type; the implementation for each table is defined in a Type which inherits from that generic Type. The specific types can be generated from metadata. I presented on this topic at the UKOUG a few years ago, and I have written it up in more detail on my blog.
By the way, Relational Theory includes the concept of Domains, which are user-defined data-types, including constraints, etc. No flavour of RDBMS actually supports Domains but Oracle's Type Implementation is definitely a step along the way.
I've never seen the benefit to it, mostly because when I last examined it, your object definitions were immutable once they were used by a table.
So if you had an Address object you used in a Customer table definition, you could never ever change the Address object definition without dropping the Customer table, or having to go through a very wonky conversion.
Objects are fine for data instantiation - like what an application does - but for data storage and set-based manipulation, well, I simply don't see the point.
Many of the other answers have given good examples of where using objects does make sense; in general these are to handle some particular, perhaps complex, types of data. Oracle itself uses them for geospatial data.
What is not commonly done, except it would sadly appear in some college courses, is to use object-based tables instead of regular relational tables to hold regular data like employees and departments, something like this:

    create type emp_t as object (empno number, ename varchar2(10), ...);
    create table emp of emp_t;
While these may be nice simple examples to teach the concepts, I fear they may lead to a new generation of database developers who think this approach is suitable, more modern and therefore better than "old-fashioned" relational tables. It emphatically is not.
I've only heard of it being used in one place, and the developers involved were converting to get away from it. I've thought of using it purely in PL/SQL, but as our DBAs won't let us install any Types for fear that we might use them in a table, this is unlikely to happen.
Share and enjoy.
It's not too uncommon to see them play a small role somewhere in your system. For example, if you're doing something with Oracle data cartridge. Sometimes when you need to do something really weird they are necessary.
It is uncommon to see them used extensively in a system. I've seen two different systems use a lot of objects and it was a disaster both times: difficult to use, very slow, and full of bugs.
"Simple" relational methods that use basic tables, rows, and columns are almost always good enough. Every programmer (and program) can understand these concepts, and they are powerful enough for almost any task. Yet you can spend many years trying to fully understand and optimize these methods. Object relational technology adds a huge amount of complexity on top of that for very little gain.
I've used simple types with constructors and a few methods to wrap some functionality of interacting with an existing tcp server. I needed to pass x bytes (raw object) and receive back x bytes (clean object). I could have written some procedure that was particular to my task, but using object type allowed this to be a bit more generic for others. Nothing fancy, very basic OO stuff, create the raw object, populate a handful of its 100 or so properties, call its clean function, and assign the result to a new "clean" object. Anyone else who wanted to call the tcp server could follow the same basic steps, only populating whatever raw values with their data.
Still, in my experience I wouldn't say Oracle is object oriented, but rather that it has some basic object functionality. And as others said, companies don't buy Oracle for its OO capabilities. Don't get too caught up in it with Oracle, imo.
I would have to say it's not why people buy Oracle. It's very non-portable/non-standard, and as Adam pointed out, it has some usage pitfalls as well. I've not personally seen the benefit of it. I don't know how widespread its usage is, but I can't imagine it's very big. Take a look around this site to see how many questions are asked about it. That may give you some insight.
Well, I've never used them in my practice and never heard of anyone using them either, so they're not widely used, I guess. It matters when you have an object-oriented database; Oracle supports OO but is not an OO database. I think people who migrate from OO databases to Oracle use them widely.
Looking for a bit of advice on how to optimise one of our projects. We have an ASP.NET/C# system that retrieves data from a SQL2008 database and presents it on a DevExpress ASPxGridView. The data that's retrieved can come from one of a number of databases, all of which are slightly different and are being added and removed regularly. The user is presented with a list of live "companies", and the data is retrieved from the corresponding database.
At the moment, data is being retrieved using a standard SqlDataSource and a dynamically-created SQL SELECT statement. There are a few JOINs in the statement, as well as optional WHERE constraints, again dynamically-created depending on the database and the user's permission level.
All of this works great (honest!), apart from performance. When it comes to some databases, there are several hundreds of thousands of rows, and retrieving and paging through the data is quite slow (the databases are already properly indexed). I've therefore been looking at ways of speeding the system up, and it seems to boil down to two choices: XPO or LINQ.
LINQ seems to be the popular choice, but I'm not sure how easy it will be to implement with a system that is so dynamic in nature - would I need to create "definitions" for each database that LINQ could access? I'm also a bit unsure about creating the LINQ queries dynamically too, although looking at a few examples that part at least seems doable.
XPO, on the other hand, seems to allow me to create a XPO Data Source on the fly. However, I can't find too much information on how to JOIN to other tables.
Can anyone offer any advice on which method - if any - is the best to try and retro-fit into this project? Or is the dynamic SQL model currently used fundamentally different from LINQ and XPO and best left alone?
Before you go and change the whole way that your app talks to the database, have you had a look at the following:
Run your code through a performance profiler (such as Redgate's performance profiler), the results are often surprising.
If you are constructing the SQL string on the fly, are you using .NET best practices such as String.Concat("str1", "str2") instead of "str1" + "str2"? Remember, multiple small gains add up to big gains.
Have you thought about having a summary table or database that is periodically updated (say every 15 minutes; you might need to run a service to update this data automatically) so that you are only hitting one database? New connections to databases are quite expensive.
Have you looked at the query plans for the SQL that you are running? Today, I moved a dynamically created SQL string to a sproc (only 1 param changed) and shaved 5-10 seconds off the running time (it was being called 100-10000 times depending on some conditions).
Just a warning if you do use LINQ. I have seen some developers who have decided to use LINQ write more inefficient code because they did not know what they were doing (pulling 36,000 records when they needed to check for 1, for example). These things are very easily overlooked.
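To illustrate that last warning, here is a minimal sketch using LINQ to Objects. The Rows() iterator is a made-up stand-in for a 36,000-row table that counts how many rows are actually read. Against a real database the cost would show up as rows transferred instead, and a query provider can usually turn Any() into an existence check on the server, but that part is an assumption not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int rowsRead = 0;

// Stand-in for a table of 36,000 rows; counts every row that is actually read.
IEnumerable<int> Rows()
{
    for (int id = 1; id <= 36_000; id++)
    {
        rowsRead++;
        yield return id;
    }
}

// Wasteful: materializes every row just to learn whether any exist.
bool existsSlow = Rows().ToList().Count > 0;
int readBySlow = rowsRead;

rowsRead = 0;

// Cheaper: Any() stops as soon as the first row is found.
bool existsFast = Rows().Any();
int readByFast = rowsRead;

Console.WriteLine($"ToList().Count > 0 read {readBySlow} rows");
Console.WriteLine($"Any() read {readByFast} row(s)");
```

Both checks give the same answer; only the amount of data touched differs.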
Just something to get you started on and hopefully there is something there that you haven't thought of.
Cheers,
Stu
As far as I understand, you are talking about the so-called server mode, in which all data manipulations are done on the DB server instead of fetching the records to the web server and processing them there. In this mode the grid works very fast with data sources that contain hundreds of thousands of records. If you want to use this mode, you should create either the corresponding LINQ classes or XPO classes. If you decide to use LINQ-based server mode, the LINQServerModeDataSource provides the Selecting event, which can be used to set a custom IQueryable and KeyExpression. I would suggest that you use LINQ in your application. I hope this information will be helpful to you.
I guess there are two points where performance might be tweaked in this case. I'll assume that you're accessing the database directly rather than through some kind of secondary layer.
First, you don't say how you're displaying the data itself. If you're loading thousands of records into a grid, that will take time no matter how fast everything else is. Obviously the trick here is to show a subset of the data and allow the user to page, etc. If you're not doing this then that might be a good place to start.
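As a rough sketch of the "show a subset and let the user page" idea, Skip and Take express a page in LINQ. The example below runs over an in-memory sequence with made-up data, and the page size and index are hypothetical; a provider such as LINQ to SQL would be expected to translate the same operators into a paged SQL query, which is not shown here.

```csharp
using System;
using System.Linq;

// Stand-in for a large, ordered result set (made-up data).
var rows = Enumerable.Range(1, 1000).Select(i => $"Row {i}");

int pageSize = 25;   // hypothetical page size
int pageIndex = 2;   // zero-based, so this is the third page

// Only the requested page is materialized.
var page = rows
    .Skip(pageIndex * pageSize)
    .Take(pageSize)
    .ToList();

Console.WriteLine($"{page.First()} .. {page.Last()} ({page.Count} rows)");
```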
Second, you say that the tables are properly indexed. If this is the case, and assuming that you're not loading 1,000 records into the page at once but retrieving only subsets at a time, then you should be OK.
But, if you're only doing an ExecuteQuery() against an SQL connection to get a dataset back I don't see how Linq or anything else will help you. I'd say that the problem is obviously on the DB side.
So to solve the problem with the database you need to profile the different SELECT statements you're running against it, examine the query plans and identify the places where things are slowing down. You might want to start by using the SQL Server Profiler, but if you have a good DBA, just looking at the query plan (which you can get from Management Studio) is often enough.
I've just started using LINQ to SQL on a mid-sized project, and would like to increase my understanding of what advantages L2S offers.
One disadvantage I see is that it adds another layer of code, and my understanding is that it has slower performance than using stored procedures and ADO.Net. It also seems that debugging could be a challenge, especially for more complex queries, and that these might end up being moved to a stored proc anyway.
I've always wanted a way to write queries in a better development environment. Are L2S queries the solution I've been looking for? Or have we just created another layer on top of SQL, and now have twice as much to worry about?
Advantages L2S offers:
No magic strings, like you have in SQL queries
Intellisense
Compile check when database changes
Faster development
Unit of work pattern (context)
Auto-generated domain objects that are usable in small projects
Lazy loading.
Learning to write LINQ queries and lambdas is a must for .NET developers.
Regarding performance:
Most likely the performance is not going to be a problem in most solutions. Premature optimization is an anti-pattern. If you later see that some areas of the application are too slow, you can analyze these parts, and in some cases even swap some LINQ queries for stored procedures or ADO.NET.
In many cases the lazy loading feature can speed up performance, or at least simplify the code a lot.
Regarding debugging:
In my opinion debugging Linq2Sql is much easier than debugging either stored procedures or ADO.NET. I recommend that you take a look at Linq2Sql Debug Visualizer, which enables you to see the query, and even trigger an execute to see the result when debugging.
You can also configure the context to write all SQL queries to the console window; more information here
Regarding another layer:
Linq2Sql can be seen as another layer, but it is purely a data access layer. Stored procedures are also another layer of code, and I have seen many cases where part of the business logic has been implemented in stored procedures. This is much worse in my opinion, because you are then splitting the business layer across two places, and it will be harder for developers to get a clear view of the business domain.
Just a few quick thoughts.
LINQ in general
Query in-memory collections and out-of-process data stores with the same syntax and operators
A declarative style works very well for queries - it's easier to both read and write in very many cases
Neat language integration allows new providers (both in and out of process) to be written and take advantage of the same query expression syntax
LINQ to SQL (or other database LINQ)
Writing queries where you need them rather than as stored procs makes development a lot faster: there are far fewer steps involved just to get the data you want
Far fewer strings (stored procs, parameter names or just plain SQL) involved where typos can be irritating; the other side of this coin is that you get Intellisense for your query
Unless you're going to work with the "raw" data from ADO.NET, you're going to have an object model somewhere anyway. Why not let LINQ to SQL handle it for you? I rather like being able to just do a query and get back the objects, ready to use.
I'd expect the performance to be fine - and where it isn't, you can tune it yourself or fall back to straight SQL. Using an ORM certainly doesn't remove the need for creating the right indexes etc, and you should usually check the SQL being generated for non-trivial queries.
It's not a panacea by any means, but I vastly prefer it to either making SQL queries directly or using stored procs.
I must say they are what you have been looking for. It takes some time getting used to it, but once you do you can't think of going back (at least for me).
Regarding LINQ vs. stored procedures, you can have poor performance with either if you build it wrong. I moved some of a client's awfully coded stored procedures to LINQ to SQL, and the time dropped from 20 secs (totally unacceptable for a web app) to < 1 sec. And with much, much less code than the stored procedure solution.
Update 1: Also you get a lot of flexibility, as you can limit the columns of what you are getting and it will actually only retrieve that. On the stored procedure solution you have to define a procedure for each column set you are getting, even if the underlying queries are the same.
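A small sketch of what that flexibility looks like in code, using LINQ to Objects so that it runs on its own. The Customer class, the InCity helper and the sample data are made up for illustration. The same filtering query is reused with two different projections. With LINQ to SQL, the generated SELECT would list only the projected columns, whereas here the projection simply shapes in-memory objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity used only for this illustration.
public sealed class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string City { get; set; }
    public string Notes { get; set; }
}

public static class ProjectionDemo
{
    // One underlying query, shared by every column set.
    public static IQueryable<Customer> InCity(IQueryable<Customer> customers, string city)
    {
        return customers.Where(c => c.City == city);
    }

    public static void Main()
    {
        IQueryable<Customer> customers = new List<Customer>
        {
            new Customer { Id = 1, Name = "Ann", City = "Oslo", Notes = "long text" },
            new Customer { Id = 2, Name = "Bob", City = "Bergen", Notes = "long text" },
            new Customer { Id = 3, Name = "Cid", City = "Oslo", Notes = "long text" },
        }.AsQueryable();

        // Column set 1: just Id and Name.
        var idsAndNames = InCity(customers, "Oslo")
            .Select(c => new { c.Id, c.Name })
            .ToList();

        // Column set 2: just Notes, from the same underlying query.
        List<string> notes = InCity(customers, "Oslo")
            .Select(c => c.Notes)
            .ToList();

        foreach (var row in idsAndNames)
        {
            Console.WriteLine($"{row.Id}: {row.Name}");
        }
        Console.WriteLine($"{notes.Count} note(s)");
    }
}
```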
Just as an update, here are some links on the future of LINQ to SQL:
What is the Future of Linq to SQL
Has Microsoft confirmed their stance on LINQ to SQL end-of-life?
Is LINQ to SQL Dead or Alive?
As a comment in the last link states, LINQ to SQL isn't going to go away, just not "improved upon" at least by Microsoft. Take these comments and posts as you will, just be cautious in your development plans.
We switched over to LINQ2Entity over the Entity Framework environment recently. Before, we had basic SQLadapters. Since the database we are working with is rather small, I cannot comment on the performance of LINQ.
I must admit, though, that writing queries has become a lot easier, and the addition of Entities does enable strong typing.
I am creating a data source for reporting model (SQL Server Reporting Services).
The reports require a lot of joins and calculations (let's say, calculating financial parameters like money spent on this, that, amount A vs amount B)...all this involves subobjects.
It makes a lot of sense to me to write unit tests for this code (i.e. walking through order collection, aggregating info based on business rules and subobjects etc.)
To do this properly, I would expect my code to look approx. like this
foreach (IOrder order in Orders)
    foreach (IOrderLine line in order.Orderlines)
        ...
return ...
and then test the return value.
But this code is not the SQL which is going to be used in the reporting view...of course...
So I am thinking, I could plug-in a .NET assembly in the database.
The issue here is, of course, performance...I don't want to loop all these objects in C#...too slow.
So, naturally, Linq/Lambda/Expression trees seem to be the answer to me.
As we know, when you are doing Linq to SQL, expression trees are built, and then proper SQL is generated based on them.
So, I could write my code in Linq to Objects, using lambda expressions, unit test this code on sample collections (having expressions compiled to .net), and reuse the same code as Linq to SQL in the DB stored procedure, so that inside SQL Server it would generate proper SQL for me (as Linq to SQL already does)...
Then I could get benefits of both unit-tests and writing domain logic code in C# and high-performing stored procedures for reports.
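The part of this idea that does not depend on SQL CLR can be sketched as follows, under stated assumptions: the OrderLine class, the CountsAsSpend rule and the TotalSpend method are all hypothetical names invented for the example. The business rule is written once as an expression tree and consumed through IQueryable, so a unit test can run it over an in-memory collection via AsQueryable(), while a provider such as LINQ to SQL could receive the same expression and translate it to SQL. Whether a particular provider can translate a particular expression is not guaranteed and would need checking.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity used only for this illustration.
public sealed class OrderLine
{
    public decimal Amount { get; set; }
    public bool IsRefund { get; set; }
}

public static class ReportRules
{
    // Hypothetical business rule, kept as an expression tree so that an
    // IQueryable provider can inspect it instead of just executing it.
    public static readonly Expression<Func<OrderLine, bool>> CountsAsSpend =
        line => !line.IsRefund && line.Amount > 0m;

    // Works against any IQueryable<OrderLine>: an in-memory list wrapped
    // with AsQueryable() in tests, or a provider-backed table elsewhere.
    public static decimal TotalSpend(IQueryable<OrderLine> lines)
    {
        return lines.Where(CountsAsSpend).Sum(line => line.Amount);
    }
}

public static class SharedExpressionDemo
{
    public static void Main()
    {
        // Sample collection standing in for unit test data.
        var sample = new List<OrderLine>
        {
            new OrderLine { Amount = 10m, IsRefund = false },
            new OrderLine { Amount = 5m, IsRefund = true },   // excluded: refund
            new OrderLine { Amount = 2.5m, IsRefund = false },
        };

        decimal total = ReportRules.TotalSpend(sample.AsQueryable());
        Console.WriteLine(total == 12.5m ? "rule behaves as expected" : "unexpected total");
    }
}
```

The same expression can also be turned into a plain delegate with CountsAsSpend.Compile() when a Func is needed for LINQ to Objects directly.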
Possible? Can I use Linq/Lambda in SQL Server CLR stored procedures? Has anyone done it or does anyone know how to make it work?
Am I crazy? Do you know a better way of doing it?
Thanks
P.S. I think I have now figured out how this should be done properly. According to Udi Dahan, if I understand him right, the database should be denormalized, and all the calculated fields should be on the objects in the table.
When something is happening on the subobject (OrderLine added), my Customer object should receive an event and recalculate the smart value (cache it and persist).
Then reports go straight-forward, without logic and work fast...
No, you cannot use LINQ/Lambda in SQL CLR procs - it is based on a different version of .NET and does not support those namespaces.
So, I could write my code in Linq to Objects, using lambda expressions, unit test this code on sample collections (having expressions compiled to .net), and reuse the same code as Linq to SQL in the DB stored procedure, so that inside SQL Server it would generate proper SQL for me (as Linq to SQL already does)...
This plan was fine until you suggested the CLR code be called from your stored procs. Running CLR code from the database process itself creates a lot of problems with regard to versioning, configuration and database stability... Too many problems if you do that.
Your motivation was to have the benefit of using stored procs, which are faster in general. If those stored procs are in turn running CLR code, they're not going to be faster than the CLR code running in the local process.
Using the LINQ-generated expressions technically consumes more CPU cycles than stored procs, because the database engine has to regenerate the execution plan each time a query is run. Typically, though, your database server is on a separate machine that is not CPU bound (it will be limited instead by disk or network capacity), so this is not a real performance issue. It could be if you run the database server on the same machine as everything else, but don't try to fix this with something so convoluted until it's a real issue.
Udi's suggestion may be appropriate if you want to decrease the overhead of generating the reports. There are two important side effects to consider first, though. First, can you afford to increase the performance overhead of the operations that pregenerate the reported fields? Second, and a bigger problem, it couples your reporting logic with the code that runs the target system. This prevents you from updating the reporting code without also updating the business code, and presumes the reporting code is running as soon as the reported code is put into production.