HANA CDS Views vs Calculation Views vs Table Functions

In SAP HANA I am used to creating Calculation Views.
Previously I learned that Calculation Views (which after compilation are column views) are to be preferred over plain database SQL views.
Now with CDS views I am not sure whether this is still the case, especially with regard to performance.
Also what is now the difference between a table function (which replaced scripted calculation views) and CDS Views?

Ok, this is a question that I believe requires some background to be answered.
A long, long time ago...
When SAP HANA was first developed, it heavily reused concepts and technology from other, already existing SAP products (TREX, P*TIME, MaxDB, Business Warehouse Accelerator).
One of the fundamental elements of the high query performance was (and is) the column store data-storage, which came in large parts from the TREX/BWA products. These products, in turn, had been solutions to very specific problems (full-text search for catalogs and speed-up of analytical queries from the SAP Business Warehouse data warehouse product).
Especially the BWA use case is reflected in the column views of SAP HANA. Because the use case was limited to supporting SAP BW queries, no general SQL/relational query support was required (e.g. no arbitrary join-chain optimizations, no SQL features beyond SQL:92 etc.), whereas other, rather exotic features (like "vertical join") that could be used by SAP BW were built into a query tool/engine ("engine" clearly was a very popular term with the SAP developers).
Once HANA proved successful as a platform to run SAP BW on, the next step was to add flexibility and make more general platforms like SAP NetWeaver (the software that SAP's business solution products run on/with) work on SAP HANA. Now SQL features were added, and those required additional capabilities from the query optimizer and execution "engines".
Query optimization had to be flexible and fast and should lead to query performance that would still beat the existing RDBMS vendors' offering (which had been around for 40+ years).
This, clearly, is a hard problem, and throwing in the operational aspects of DB development (scaling, solution deployment, data federation, etc.) does not make it easier.
This led to an overlapping development of different tools addressing different aspects of DB development.
SQL support and the underlying SQL optimizer were made more powerful, so much so, that (some) SQL queries could be as fast or faster than those modeled in calculation views. And since both of these "query frontends" eventually had to talk to the same internal data structures (row/column store) it was desirable to have just a single query optimizer, that would support all the different use cases.
Somewhere around HANA 1 SPS11/12 most calculation views started to be "unrolled" internally to feed into the common optimizer (that was what the "Execute in SQL Engine" flag was about).
I'd say, since then, the performance argument for using calculation views only holds in very specific circumstances.
I mentioned the overlapping developments and CDS (core data services) is one of them. The idea here is a very different one from SQL. While SQL gives you "the way to talk to the database", CDS wants to give your application a single data definition, that is used by the UI, the program logic and the data storage/query execution.
SQL != CDS
This probably needs some context (again): a major usage pattern of how SQL databases are used by application developers is that the application is written in some form of OO-implementation and the talking to the DB is left to a mapping layer/library (e.g. O/R-mappers). This means, that the knowledge of what the application is about (aka business process knowledge), is spread out in the application.
There is some information about it in the UI (labels, formatting, visibility, ...), some of it is in the application-object model (object dependencies, hierarchies, value domains...) and then some of it is in the queries against the database.
Such scattered knowledge/definition makes it hard to make changes consistently, which slows the development process and in turn prolongs the time until the application can run and deliver some positive outcome.
"Time-to-value" is the thing under optimization here as this is important for companies that give the promise of "success through innovation".
Ok, so this CDS thing is now part of the development models proposed by SAP and nearly en-passant also addresses topics like schema evolution and deployment of the data model. It is, in fact, independent of the actual database platform as shown in the CDS for ABAP variety.
How does this lead back to query performance? It does not really.
CDS' advantage is that one can provide more information about the data model than what is possible in HANA SQL.
Associations and joins with cardinality declaration (albeit now retrofitted to plain SQL) can enable the optimizer to use additional optimizations. Yet, the same optimizer and the same query execution "engines" are used here.
So, from a (query execution) performance point of view, it does not make a big difference, as long as no query semantics are required for which CDS does not have syntax (e.g. some window functions).
The main point of CDS really is the performance of the application development process. Whether that works well with how you do development depends on how much of CDS you can use.
Now for the question "scripted calc view" vs. "table function" vs. "CDS view".
Looking at these different object types from the point of "what can I do with them functionally?" will result in the observation "basically, the same".
The difference lies in how these can be optimized (scripted calc views cannot be generally unrolled into the global query to be optimized), and what one can do with the object once created.
Table functions allow for very easy reuse across multiple views and queries. They also provide the option to provide parameters into the function (similar to parameterized views) and in addition allow for imperative coding.
Functionally speaking, table functions are a kind of swiss-army knife; one can do nearly anything with them and they still can be part of global query optimization.
CDS views, as mentioned above, are nothing "special" in terms of query runtime or optimization. The main reason why CDS views are "a thing" is that with HANA SAP started to develop development models (such as XS, XSA, CAM) that revolve around "virtual data models".
The idea for those is that the structure of tables very often is stable and changes only little over time.
In a way, this is the "write-schema" of applications that enter the data into tables.
The "read-schema" is most of the time different from that. Queries re-combine the normalized data into records that the application can map into objects. This allows applications to look at the data differently than the original application.
With "virtual data models" these queries are baked into tangible development artifacts (the views) that can be reused across the application. In fact, these can be treated as if this was the database with its tables, presented in a way that makes sense for the application.
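To make the write-schema/read-schema idea a bit more tangible, here is a minimal sketch. It uses Python with SQLite rather than HANA, and the tables `customers` and `orders` as well as the view `customer_revenue` are made-up names for illustration only. The point is just that the re-combining query lives in one reusable view that the application reads as if it were a table.

```python
import sqlite3

# Hypothetical write-schema: normalized tables, as the data-entry side sees them.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);

    -- Read-schema: the re-combining query is stored once as a view.
    CREATE VIEW customer_revenue AS
        SELECT c.id AS customer_id, c.name AS name, SUM(o.amount) AS revenue
        FROM customers c
        JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name;
""")


def revenue_by_customer(connection):
    """Read from the view exactly as if it were a table."""
    rows = connection.execute(
        "SELECT name, revenue FROM customer_revenue ORDER BY name"
    ).fetchall()
    return dict(rows)


result = revenue_by_customer(conn)
print(result)
```

Any other query or report can select from `customer_revenue` too, so the join and aggregation logic is defined in exactly one place.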
Once again, whether that is beneficial for your application development depends on what your application development looks like.
Can you use HANA without CDS? Absolutely, and there are many areas where CDS is lacking (e.g. the limited syntax and feature mapping to HANA features), but it does have its merits.
Should you abandon calculation views?
I would not necessarily change existing developments if they still serve their purpose, but calculation views certainly are an odd development object. Training folks in using those and SQL most likely is overly expensive compared to just sticking to SQL.
Personally, I prefer the code-based SQL development (better tooling available, allows for easier comparison with other DBMS, doesn't require WEB IDE/HANA Studio).
The only thing SQL-based development does not provide is the extended annotations/semantic information used by the SAP analytic frontend tools (SAC & BO) - these really are specific to CDS and Information Models (calculation views) and are barely used by other analytic tools.
And that's my take on it.

I would add that
Calculation Views are semantically richer. A SQL View does not know about measures, dimensions, hierarchies. https://blogs.sap.com/2019/08/26/what-is-the-difference-calcview-versus-sql-view/
The difference from the execution plan point of view keeps shrinking. In HANA 2.0 SP4 most graphical calc views are turned internally into a single SQL statement to be executed by the SQL engine. So in that sense, using a CalcView gives you the additional information about the model plus the query performance of the SQL engine.
Lars' explanation of CDS is perfect. Nothing to add there.

But imagine the situation where you can't create a table function because of a limited license (aka runtime version). Then just stay with scripted views.
The main advantage of HANA artifacts over CDS at present is the ability to use input parameters in complex cases to optimize resources and query performance - when your logic is pushed down into the DB instead of the AS / app. But many native SQL features are still not available in graphical views (for example EXISTS, or JOIN on BETWEEN), so I think that 10 years from now HANA artifacts will have become "very rare".
So learn CDS syntax :)

It is always a pleasure to read an article or point of view from Lars, on any medium (StackOverflow, SAP blog, article, Twitter).
I just want to point out that one thing I miss in SQL scripting (SP, TF, SF) is the join optimization and SQL propagation that Information Views have.
For me this is the key to flexible models (apart from dynamic join, which is only relevant for certain scenarios): delivering one view that performs according to which columns the user or app requests.
For the semantics, I can simply expose a TF inside an information view to add some.
You could tell me that CDS has all of these options available (join optimization, SQL propagation, and annotations), but for advanced or complicated scenarios (window functions are not present in CDS), and also for non-SAP developers, it will be simpler and the go-to approach for beginners.

Related

Repository Pattern Contestation

According to Martin Fowler:
... "Client objects construct query
specifications declaratively and
submit them to Repository for
satisfaction" ...
Why? What are the advantages at that point?
I see one disadvantage: database queries are spread and hidden across tiers. That makes it harder to debug.
The advantage is that the "what" (the declarative specification) is separated from the "how" or implementation details. So the client doesn't need to know whether it's querying a relational database, a Web service, an object database (e.g. Mongo), an XML data store, etc.
Let's assume you're using an RDBMS. Even so, the client is isolated from needing to know whether the database is Oracle, MS SQL, SQLite, MySQL, PostgreSQL, etc. This will save you a lot of headache when the commandment "thou shalt (not) use MS SQL" (or whatever) comes down from the mountain.
The additional layer does introduce some overhead. But (1) ORM tools like (N)Hibernate are quite good at optimizing the generated queries for whatever back-end you're using, and (2) the overhead is generally negligible compared to the cost of database read, let alone a web service call.
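A rough sketch of that "what" vs "how" separation, in Python. Everything here (`Customer`, `in_country`, `InMemoryCustomerRepository`, `names_in`) is a hypothetical name made up for the example, and the specification is a plain predicate. A repository in front of a real database would more likely translate a structured specification into a query instead of filtering in memory.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Protocol


@dataclass(frozen=True)
class Customer:
    name: str
    country: str


# A specification states *what* is wanted, not how it is fetched.
Spec = Callable[[Customer], bool]


def in_country(country: str) -> Spec:
    """Build a specification that selects customers from one country."""
    return lambda customer: customer.country == country


class CustomerRepository(Protocol):
    """The only thing client code depends on."""

    def matching(self, spec: Spec) -> List[Customer]: ...


class InMemoryCustomerRepository:
    """One possible 'how': filter a list held in memory."""

    def __init__(self, customers: Iterable[Customer]) -> None:
        self._customers = list(customers)

    def matching(self, spec: Spec) -> List[Customer]:
        return [c for c in self._customers if spec(c)]


def names_in(repo: CustomerRepository, country: str) -> List[str]:
    """Client code: it never learns what kind of store sits behind the repository."""
    return sorted(c.name for c in repo.matching(in_country(country)))


repo = InMemoryCustomerRepository(
    [Customer("Anna", "DE"), Customer("Bob", "US"), Customer("Clara", "DE")]
)
names = names_in(repo, "DE")
print(names)
```

Swapping in a repository backed by an RDBMS or a web service would leave `names_in` untouched, which is the advantage described above.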
We're converting from LINQ to NHibernate right now to avoid the "N+1" problem (i.e. you generate one query/hit for each "master" database record, plus a query/hit for each "child" record).
And BTW ... there is such a thing as LINQ to NHibernate.
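For readers who have not run into the "N+1" problem, here is a small sketch in Python with SQLite. It shows the common form of the problem (one query for the master rows plus one more query per master row to load its children) next to a single join. The `authors` and `books` tables and the `run` helper are invented for the example and have nothing to do with NHibernate's actual API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO books VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1'), (4, 3, 'C1');
""")

stats = {"queries": 0}


def run(sql, params=()):
    """Execute one statement and count it as one database round trip."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


def titles_n_plus_one():
    """One query for the masters, then one query per master for its children."""
    result = {}
    for author_id, name in run("SELECT id, name FROM authors ORDER BY id"):
        rows = run(
            "SELECT title FROM books WHERE author_id = ? ORDER BY id", (author_id,)
        )
        result[name] = [title for (title,) in rows]
    return result


def titles_single_join():
    """The same data fetched with a single joined query."""
    result = {}
    rows = run(
        "SELECT a.name, b.title FROM authors a "
        "JOIN books b ON b.author_id = a.id ORDER BY a.id, b.id"
    )
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result


stats["queries"] = 0
lazy = titles_n_plus_one()
lazy_queries = stats["queries"]

stats["queries"] = 0
eager = titles_single_join()
eager_queries = stats["queries"]

print(lazy_queries, eager_queries)
```

With three authors the first variant issues four statements and the second issues one. The gap grows linearly with the number of master rows.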

rewriting a SQL/vb6 app - should I use nHibernate or Linq

I have a legacy VB6 app which I am rewriting in .Net. I have not used an ORM package before (being the old fashioned type who likes to know what SQL is being used), but I have seen good reports of NHibernate and I am tempted to use it for this project. I just want to check I won't be shooting myself in the foot.
Because my new app will initially run alongside the existing one, any ORM I use must work with the existing database schema. Also, I need to use SQL server text searching. From what I gather, LINQ to SQL does not support Text searching, so this will rule it out.
The app uses its own method of allocating IDs for new objects - will NHibernate allow this or does it expect to use its own mechanisms?
Also I have read that NHibernate does caching. I need to make sure that rows inserted outside of NHibernate are immediately accessible when accessing the database from NHibernate and vice versa.
There are 4 or 5 main tables and 10 or so subsidiary tables. Although a couple of the main tables have up to a million rows, the app itself will normally be only returning a few. The user load is low so I don't anticipate performance being a problem.
At the moment I'm not sure whether it will be ASP.NET or win forms but either way I will be expecting to use data binding.
In terms of functionality, the app is not particularly complicated - the budget to re-implement it is about 20 man days, so if I am going to use ORM it has to be something that will start paying for itself pretty quickly. Similarly I want the app to be simple to deploy and not require some monster enterprise framework.
Any thoughts on whether this is a suitable project for NHibernate would be much appreciated.
While ORMs are good, I personally wouldn't take on the risk of trying to use any ORM on a 20 day project if I had to absorb the ORM learning curve as I went.
If you have ADO.NET infrastructure you are comfortable with and you can live without ORM features, that is the much less risky approach to take.
You should learn ORMs and Linq (not necessarily Linq To Sql) eventually, but it's much more enjoyable when there is no immediate time pressure.
This sounds more like a risk management issue and that requires you to make a personal decision about how willing you are to see the project fail due to embracing new (to you) technologies.
You might also check out LLBL Gen Pro. It is a very mature ORM that handles a lot of different scenarios.
I have successfully fitted an NHibernate domain model to a few legacy database schemas - it's not yet proved impossible, but it is sometimes not without its difficulties. The easiest schemas to map are those where all primary keys and foreign keys are single column ones, but with so few tables you should be able to do the mapping relatively quickly even if this is not true of yours.
I strongly recommend, particularly given your timescale, that you use Fluent NHibernate to do your mappings - the time to learn the XML mapping file syntax may be too big an ask. However, you will need to use an XML mapping file for your full-text indexing stuff (assuming that's what you meant), writing these as named SQL queries. (See nhibernate.info documentation for details.)
Suggest you spend a day or two trying to create a model for a couple of your tables, and writing code to interact with them. There'll always be people on SO ready to answer any questions you have.
You may also want to take a look at Linq to NHibernate - we've found it helpful in terms of abstracting even more of our database access stuff away behind a simple interface. But it's Fluent NHibernate that will give you the biggest and quickest win in terms of "cheating" on the NHibernate learning curve.

ORM for Oracle pl/sql

I am developing enterprise software for a big company using Oracle. The major processing unit is planned to be developed in PL/SQL. I am wondering whether there is any ORM like Hibernate for Java, but for PL/SQL. I have some ideas on how to build such a framework using PL/SQL and Oracle system tables, but it makes me curious: why has no one done this before? Do you think it would be efficient in terms of speed and memory consumption? Why?
ORMs exist to provide an interface between a database-agnostic language like Java and a DBMS like Oracle. PL/SQL in contrast knows the Oracle DBMS intimately and is designed to work with it (and a lot more efficiently than Java + ORM can). So an ORM between PL/SQL and the Oracle DBMS would be both superfluous and unhelpful!
Take a read through these two articles - they contain some interesting points
Ask Tom - Relational VS Object Oriented Database Design
Ask Tom - Object relational impedance mismatch
As Tony pointed out ORMs really serve as helper between the App and Db context boundaries.
If you are looking for an additional level of abstraction at the database layer you might want to look into table encapsulation. This was a big trend back in the early 2000s. If you search you will find a ton of whitepapers on this subject.
Plsqlintgen still seems to be around at http://sourceforge.net/projects/plsqlintgen/
This answer has some relevant thoughts on the pros and cons of wrapping your tables in pl/sql TAPIs (Table APIs) for CRUD operations.
Understanding the differences between Table and Transaction API's
There was also a good panel discussion on this at last year's UK Oracle User Group - the overall conclusion was against using table APIs and for transaction APIs, for much the same reason - the strength of pl/sql is the procedural control of SQL statements, while TAPIs push you away from writing set-based SQL operations and towards row-by-row processing.
The argument for TAPI is where you may want to enforce some kind of access policy, but Oracle offers a lot of other ways to do this (fine-grained access control, constraints, triggers on insert/update/etc can be used to populate defaults and enforce that the calling code is passing a valid request).
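To illustrate the row-by-row vs set-based point without an Oracle instance, here is a sketch in Python with SQLite. The `employees` table and the `tapi_*` functions are hypothetical stand-ins for a generated table API, and `raise_set_based` plays the role of a transaction API. The statement counts are simply tallied by hand in the code.

```python
import sqlite3


def make_db():
    """Create a tiny, made-up employees table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employees (id INTEGER PRIMARY KEY, dept TEXT, salary INTEGER);
        INSERT INTO employees VALUES
            (1, 'IT', 1000), (2, 'IT', 2000), (3, 'HR', 1500);
    """)
    return conn


# Hypothetical table API: one function per single-row operation.
def tapi_get_ids(conn, dept):
    sql = "SELECT id FROM employees WHERE dept = ? ORDER BY id"
    return [row[0] for row in conn.execute(sql, (dept,))]


def tapi_get_salary(conn, emp_id):
    sql = "SELECT salary FROM employees WHERE id = ?"
    return conn.execute(sql, (emp_id,)).fetchone()[0]


def tapi_set_salary(conn, emp_id, salary):
    conn.execute("UPDATE employees SET salary = ? WHERE id = ?", (salary, emp_id))


def raise_row_by_row(conn, dept):
    """10% raise through the table API; returns the number of statements issued."""
    statements = 1  # the id lookup
    for emp_id in tapi_get_ids(conn, dept):
        salary = tapi_get_salary(conn, emp_id)
        tapi_set_salary(conn, emp_id, salary + salary // 10)
        statements += 2  # one SELECT and one UPDATE per row
    return statements


def raise_set_based(conn, dept):
    """The same raise as one set-based statement (transaction API style)."""
    conn.execute(
        "UPDATE employees SET salary = salary + salary / 10 WHERE dept = ?", (dept,)
    )
    return 1


def salaries(conn):
    return conn.execute("SELECT id, salary FROM employees ORDER BY id").fetchall()


db_a, db_b = make_db(), make_db()
row_statements = raise_row_by_row(db_a, "IT")
set_statements = raise_set_based(db_b, "IT")
print(row_statements, set_statements, salaries(db_a) == salaries(db_b))
```

Both variants leave the table in the same state, but the table API version needs a statement count that grows with the number of rows touched.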
I would definitely advise against wrapping tables in PL/SQL object types.
A lot of the productivity with pl/sql comes from the fact that you can easily define things in terms of the underlying database structure - a row record type can be simply defined as %ROWTYPE, and will be automatically impacted when the table structure changes.
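-- declaration section: the record type follows the table definition
myRec myTable%ROWTYPE;
-- executable section: insert the whole record in one statement
INSERT INTO myTable VALUES myRec;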
This also applies to collections based over these types, and there are powerful bulk operations that can be used to fetch & insert whole collections.
On the other hand, object types must be explicitly impacted each time you want to change them - every table change would require the object type to be impacted and released, doubling your work.
It can also be difficult to release changes if you are using inheritance and collections of types (you can 'replace' a package, but cannot replace a type once it is used by another type).
This isn't putting OO PL/SQL down - there are places where it definitely simplifies code (i.e. avoiding code duplication, anywhere you would clearly benefit from polymorphism) - but it is best to understand and play to the strengths of the language, and the main strength is that the language is tightly-coupled to the underlying DB.
That said, I do often find myself creating procedures to construct a default record, insert a record, etc - often enough to have editor macros for it - but I've never found a good argument for automatically generating this code for all tables (a good way to create a lot of unused code??)
Oracle is a relational database that can also work as an object-oriented database. It does this by building an abstraction layer (fairly automatically) on top of the relational structure. This would seemingly eliminate the need for any "tool", as it is already built in.

What are the advantages of LINQ to SQL?

I've just started using LINQ to SQL on a mid-sized project, and would like to increase my understanding of what advantages L2S offers.
One disadvantage I see is that it adds another layer of code, and my understanding is that it has slower performance than using stored procedures and ADO.Net. It also seems that debugging could be a challenge, especially for more complex queries, and that these might end up being moved to a stored proc anyway.
I've always wanted a way to write queries in a better development environment, are L2S queries the solution I've been looking for? Or have we just created another layer on top of SQL, and now have twice as much to worry about?
Advantages L2S offers:
No magic strings, like you have in SQL queries
Intellisense
Compile check when database changes
Faster development
Unit of work pattern (context)
Auto-generated domain objects that are usable in small projects
Lazy loading.
Learning to write LINQ queries/lambdas is a must for .NET developers.
Regarding performance:
Most likely performance is not going to be a problem in most solutions. Premature optimization is an anti-pattern. If you later see that some areas of the application are too slow, you can analyze these parts, and in some cases even swap some LINQ queries for stored procedures or ADO.NET.
In many cases the lazy loading feature can speed up performance, or at least simplify the code a lot.
Regarding debugging:
In my opinion debugging Linq2Sql is much easier than both stored procedures and ADO.NET. I recommend that you take a look at Linq2Sql Debug Visualizer, which enables you to see the query, and even trigger an execute to see the result when debugging.
You can also configure the context to write all sql queries to the console window, more information here
Regarding another layer:
Linq2Sql can be seen as another layer, but it is purely a data access layer. Stored procedures are also another layer of code, and I have seen many cases where part of the business logic has been implemented in stored procedures. This is much worse in my opinion because you are then splitting the business layer into two places, and it will be harder for developers to get a clear view of the business domain.
Just a few quick thoughts.
LINQ in general
Query in-memory collections and out-of-process data stores with the same syntax and operators
A declarative style works very well for queries - it's easier to both read and write in very many cases
Neat language integration allows new providers (both in and out of process) to be written and take advantage of the same query expression syntax
LINQ to SQL (or other database LINQ)
Writing queries where you need them rather than as stored procs makes development a lot faster: there are far fewer steps involved just to get the data you want
Far fewer strings (stored procs, parameter names or just plain SQL) involved where typos can be irritating; the other side of this coin is that you get Intellisense for your query
Unless you're going to work with the "raw" data from ADO.NET, you're going to have an object model somewhere anyway. Why not let LINQ to SQL handle it for you? I rather like being able to just do a query and get back the objects, ready to use.
I'd expect the performance to be fine - and where it isn't, you can tune it yourself or fall back to straight SQL. Using an ORM certainly doesn't remove the need for creating the right indexes etc, and you should usually check the SQL being generated for non-trivial queries.
It's not a panacea by any means, but I vastly prefer it to either making SQL queries directly or using stored procs.
I must say they are what you have been looking for. It takes some time getting used to it, but once you do you can't think of going back (at least for me).
Regarding LINQ vs. stored procedures, you can have poor performance with either if you build it wrong. I moved some awfully coded stored procedures of a client to LINQ to SQL, and the time dropped from 20 secs (totally unacceptable for a web app) to < 1 sec. And it took much, much less code than the stored procedure solution.
Update 1: Also you get a lot of flexibility, as you can limit the columns of what you are getting and it will actually only retrieve that. On the stored procedure solution you have to define a procedure for each column set you are getting, even if the underlying queries are the same.
Just as an update, here are some links on the future of LINQ to SQL:
What is the Future of Linq to SQL
Has Microsoft confirmed their stance on LINQ to SQL end-of-life?
Is LINQ to SQL Dead or Alive?
As a comment in the last link states, LINQ to SQL isn't going to go away, just not "improved upon" at least by Microsoft. Take these comments and posts as you will, just be cautious in your development plans.
We switched over to LINQ2Entity over the Entity Framework environment recently. Before, we had basic SQLadapters. Since the database we are working with is rather small, I cannot comment on the performance of LINQ.
I must admit though, writing queries has become a lot easier, and the addition of Entities does enable strong typing.

LINQ-to-SQL vs stored procedures? [closed]

Closed 9 years ago.
I took a look at the "Beginner's Guide to LINQ" post here on StackOverflow (Beginners Guide to LINQ), but had a follow-up question:
We're about to ramp up a new project where nearly all of our database op's will be fairly simple data retrievals (there's another segment of the project which already writes the data). Most of our other projects up to this point make use of stored procedures for such things. However, I'd like to leverage LINQ-to-SQL if it makes more sense.
So, the question is this: For simple data retrievals, which approach is better, LINQ-to-SQL or stored procs? Any specific pro's or con's?
Thanks.
Some advantages of LINQ over sprocs:
(1) Type safety: I think we all understand this.
(2) Abstraction: This is especially true with LINQ-to-Entities. This abstraction also allows the framework to add additional improvements that you can easily take advantage of. PLINQ is an example of adding multi-threading support to LINQ. Code changes are minimal to add this support. It would be MUCH harder to do this with data access code that simply calls sprocs.
(3) Debugging support: I can use any .NET debugger to debug the queries. With sprocs, you cannot easily debug the SQL and that experience is largely tied to your database vendor (MS SQL Server provides a query analyzer, but often that isn't enough).
(4) Vendor agnostic: LINQ works with lots of databases and the number of supported databases will only increase. Sprocs are not always portable between databases, either because of varying syntax or feature support (if the database supports sprocs at all).
(5) Deployment: Others have mentioned this already, but it's easier to deploy a single assembly than to deploy a set of sprocs. This also ties in with #4.
(6) Easier: You don't have to learn T-SQL to do data access, nor do you have to learn the data access API (e.g. ADO.NET) necessary for calling the sprocs. This is related to #3 and #4.
Some disadvantages of LINQ vs sprocs:
Network traffic: sprocs need only serialize sproc-name and argument data over the wire while LINQ sends the entire query. This can get really bad if the queries are very complex. However, LINQ's abstraction allows Microsoft to improve this over time.
Less flexible: Sprocs can take full advantage of a database's featureset. LINQ tends to be more generic in its support. This is common in any kind of language abstraction (e.g. C# vs assembler).
Recompiling: If you need to make changes to the way you do data access, you need to recompile, version, and redeploy your assembly. Sprocs can sometimes allow a DBA to tune the data access routine without a need to redeploy anything.
Security and manageability are something that people argue about too.
Security: For example, you can protect your sensitive data by restricting access to the tables directly, and put ACLs on the sprocs. With LINQ, however, you can still restrict direct access to tables and instead put ACLs on updatable table views to achieve a similar end (assuming your database supports updatable views).
Manageability: Using views also gives you the advantage of shielding your application from non-breaking schema changes (like table normalization). You can update the view without requiring your data access code to change.
I used to be a big sproc guy, but I'm starting to lean towards LINQ as a better alternative in general. If there are some areas where sprocs are clearly better, then I'll probably still write a sproc but access it using LINQ. :)
I am generally a proponent of putting everything in stored procedures, for all of the reasons DBAs have been harping on for years. In the case of Linq, it is true that there will be no performance difference with simple CRUD queries.
But keep a few things in mind when making this decision: using any ORM couples you tightly to your data model. A DBA has no freedom to make changes to the data model without forcing you to change your compiled code. With stored procedures, you can hide these sorts of changes to an extent, since the parameter list and results set(s) returned from a procedure represent its contract, and the innards can be changed around, just so long as that contract is still met.
And also, if Linq is used for more complex queries, tuning the database becomes a much more difficult task. When a stored procedure is running slow, the DBA can totally focus on the code in isolation, and has lots of options, just so that contract is still satisfied when he/she is done.
I have seen many, many cases where serious problems in an application were addressed by changes to the schema and code in stored procedures without any change to deployed, compiled code.
Perhaps a hybrid approach would be nice with Linq? Linq can, of course, be used to call stored procedures.
Linq to Sql.
SQL Server will cache the query plans, so there's no performance gain for sprocs.
Your linq statements, on the other hand, will be logically part of and tested with your application. Sprocs are always a bit separated and are harder to maintain and test.
If I was working on a new application from scratch right now I would just use Linq, no sprocs.
For basic data retrieval I would be going for Linq without hesitation.
Since moving to Linq I've found the following advantages:
Debugging my DAL has never been easier.
Compile time safety when your schema changes is priceless.
Deployment is easier because everything is compiled into DLL's. No more managing deployment scripts.
Because Linq can support querying anything that implements the IQueryable interface, you will be able to use the same syntax to query XML, Objects and any other datasource without having to learn a new syntax
LINQ will bloat the procedure cache
If an application is using LINQ to SQL and the queries involve the use of strings that can be highly variable in length, the SQL Server procedure cache will become bloated with one version of the query for every possible string length. For example, consider the following very simple queries created against the Person.AddressTypes table in the AdventureWorks2008 database:
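// Query 1: the parameter value "Billing" has 7 characters
var p =
    from n in x.AddressTypes
    where n.Name == "Billing"
    select n;
// Query 2: same query shape, but "Main Office" has 11 characters
var q =
    from n in x.AddressTypes
    where n.Name == "Main Office"
    select n;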
If both of these queries are run, we will see two entries in the SQL Server procedure cache: One bound with an NVARCHAR(7), and the other with an NVARCHAR(11). Now imagine if there were hundreds or thousands of different input strings, all with different lengths. The procedure cache would become unnecessarily filled with all sorts of different plans for the exact same query.
More here: https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=363290
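The effect described above can be mimicked with a toy model in Python. This is only a sketch under a big assumption: it pretends a plan cache is keyed on the statement text plus the declared parameter type, which is a simplification and not SQL Server's real implementation. The SQL text, the `plan_cache_key` and `distinct_plans` helpers, the extra sample names and the fixed length of 50 are all invented for the example.

```python
from typing import Iterable, Optional, Tuple

# Illustrative statement text only; not the exact SQL that LINQ to SQL emits.
SQL = "SELECT * FROM AddressTypes WHERE Name = @p0"


def plan_cache_key(
    sql: str, value: str, declared_length: Optional[int] = None
) -> Tuple[str, str]:
    """Toy model: one cached plan per (statement text, declared parameter type)."""
    # Without an explicit length, the type is sized to the actual value.
    length = declared_length if declared_length is not None else len(value)
    return (sql, f"NVARCHAR({length})")


def distinct_plans(values: Iterable[str], declared_length: Optional[int] = None) -> int:
    """Count how many different cache entries the given parameter values produce."""
    return len({plan_cache_key(SQL, v, declared_length) for v in values})


names = ["Billing", "Main Office", "Home", "Primary", "Shipping"]
by_value_length = distinct_plans(names)
by_fixed_length = distinct_plans(names, declared_length=50)
print(by_value_length, by_fixed_length)
```

In this model the five values produce four cache entries when the type follows the value length ("Billing" and "Primary" share NVARCHAR(7)), and a single entry when the parameter is always declared with the same length. Declaring a fixed parameter size is the usual way to avoid this with hand-written parameterized queries, as far as I know.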
I think the pro LINQ argument seems to be coming from people who don't have a history with database development (in general).
Especially if using a product like VS DB Pro or Team Suite, many of the arguments made here do not apply, for instance:
Harder to maintain and Test:
VS provides full syntax checking, style checking, referential and constraint checking and more. It also provides full unit testing capabilities and refactoring tools.
LINQ makes true unit testing impossible as (in my mind) it fails the ACID test.
Debugging is easier in LINQ:
Why? VS allows full step-in from managed code and regular debugging of SPs.
Compiled into a single DLL rather than deployment scripts:
Once again, VS comes to the rescue where it can build and deploy full databases or make data-safe incremental changes.
Don't have to learn TSQL with LINQ:
No you don't, but you have to learn LINQ - where's the benefit?
I really don't see this as being a benefit. Being able to change something in isolation might sound good in theory, but just because the changes fulfil a contract doesn't mean it's returning the correct results. To be able to determine what the correct results are you need context and you get that context from the calling code.
Um, loosely coupled apps are the ultimate goal of all good programmers as they really do increase flexibility. Being able to change things in isolation is fantastic, and it is your unit tests that will ensure it is still returning appropriate results.
Before you all get upset, I think LINQ has its place and has a grand future. But for complex, data-intensive applications I do not think it is ready to take the place of stored procedures. This was a view I had echoed by an MVP at TechEd this year (they will remain nameless).
EDIT: The LINQ to SQL Stored Procedure side of things is something I still need to read more on - depending on what I find I may alter my above diatribe ;)
LINQ is new and has its place. LINQ was not invented to replace stored procedures.
Here I will focus on some performance myths and cons, just for "LINQ to SQL". Of course, I might be totally wrong ;-)
(1) People say a LINQ statement can be "cached" in SQL Server, so it doesn't lose performance. Partially true. "LINQ to SQL" is actually the runtime translating LINQ syntax into a T-SQL statement. So from a performance perspective, a hard-coded ADO.NET SQL statement is no different from LINQ.
(2) As an example, say a customer service UI has an "account transfer" function. This function might update 10 DB tables and return some messages in one shot. With LINQ, you have to build a set of statements and send them as one batch to SQL Server. The performance of this translated LINQ-to-T-SQL batch can hardly match a stored procedure. The reason? In a stored procedure you can tweak the smallest unit of each statement using the built-in SQL Profiler and execution plan tools; you cannot do this in LINQ.
The point is, for single-table CRUD on a small set of data, LINQ is as fast as an SP. But for much more complicated logic, a stored procedure is easier to tune for performance.
(3) "LINQ to SQL" makes it easy for newbies to introduce performance hogs. Any senior T-SQL guy can tell you when not to use a CURSOR (basically, you should not use a CURSOR in T-SQL in most cases). With LINQ and the charming "foreach" loop over a query, it's so easy for a newbie to write code like this:
foreach (Customer c in query)
{
    c.Country = "Wonder Land";
}
ctx.SubmitChanges();
This simple, decent-looking code is very attractive. But under the hood, the .NET runtime just translates it into a batch of UPDATE statements, one per row. If there are only 500 rows, this is a 500-statement T-SQL batch; if there are a million rows, it is a serious performance hit. Of course, an experienced user won't do the job this way, but the point is that it's so easy to fall into it.
The best code is no code, and with stored procedures you have to write at least some code in the database and code in the application to call it, whereas with LINQ to SQL or LINQ to Entities, you don't have to write any additional code beyond any other LINQ query aside from instantiating a context object.
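For contrast, a set-based version of the loop above might look like the sketch below. It uses DataContext.ExecuteCommand from LINQ to SQL to send a single UPDATE statement. The table name Customers, the column name and the helper itself are assumptions for illustration, not part of the original example.

```csharp
using System.Data.Linq;

static class BulkUpdateSketch
{
    // Hypothetical helper: the table and column names are assumptions.
    public static int SetCountryForAll(DataContext ctx, string country)
    {
        // One statement updates every row on the server; {0} is sent as a
        // SQL parameter rather than concatenated into the command text.
        return ctx.ExecuteCommand("UPDATE Customers SET Country = {0}", country);
    }
}
```

The return value is the number of rows affected. Entities already loaded into the context are not refreshed by this call, so it suits bulk work better than it suits code that goes on using the tracked objects.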
LINQ definitely has its place in application-specific databases and in small businesses.
But in a large enterprise, where central databases serve as a hub of common data for many applications, we need abstraction. We need to centrally manage security and show access histories. We need to be able to do impact analysis: if I make a small change to the data model to serve a new business need, what queries need to be changed and what applications need to be re-tested? Views and Stored Procedures give me that. If LINQ can do all that, and make our programmers more productive, I'll welcome it -- does anyone have experience using it in this kind of environment?
A DBA has no freedom to make changes to the data model without forcing you to change your compiled code. With stored procedures, you can hide these sorts of changes to an extent, since the parameter list and results set(s) returned from a procedure represent its contract, and the innards can be changed around, just so long as that contract is still met.
I really don't see this as being a benefit. Being able to change something in isolation might sound good in theory, but just because the changes fulfil a contract doesn't mean it's returning the correct results. To be able to determine what the correct results are you need context and you get that context from the calling code.
I think you need to go with procs for anything real.
A) Writing all your logic in linq means your database is less useful because only your application can consume it.
B) I'm not convinced that object modelling is better than relational modelling anyway.
C) Testing and developing a stored procedure in SQL is a hell of a lot faster than a compile edit cycle in any Visual Studio environment. You just edit, F5 and hit select and you are off to the races.
D) It's easier to manage and deploy stored procedures than assemblies. You just put the file on the server and press F5...
E) Linq to sql still writes crappy code at times when you don't expect it.
Honestly, I think the ultimate thing would be for MS to augment T-SQL so that it can do a join projection implicitly the way LINQ does. T-SQL should know if you wanted to do order.lineitems.part, for example.
LINQ doesn't prohibit the use of stored procedures. I've used mixed mode with LINQ-SQL and LINQ-storedproc. Personally, I'm glad I don't have to write the stored procs....pwet-tu.
IMHO, RAD = LINQ, RUP = Stored Procs. I worked for a large Fortune 500 company for many years, at many levels including management, and frankly, I would never hire RUP developers to do RAD development. They are so siloed that they have very limited knowledge of what to do at other levels of the process. With a siloed environment, it makes sense to give DBAs control over the data through very specific entry points, because others frankly don't know the best ways to accomplish data management.
But large enterprises move painfully slowly in the development arena, and this is extremely costly. There are times when you need to move faster to save both time and money, and LINQ provides that and more in spades.
Sometimes I think that DBAs are biased against LINQ because they feel it threatens their job security. But that's the nature of the beast, ladies and gentlemen.
Going by what the gurus say, I think of LINQ as a motorcycle and SP as a car.
If you want to go on a short trip and only have a few passengers (in this case 2), go gracefully with LINQ.
But if you want to go on a long journey with a large band, I think you should choose SP.
In conclusion, choosing between the motorcycle and the car depends on your route (business), length (time), and passengers (data).
Hope it helps, I may be wrong. :D
Also, there is the issue of possible 2.0 rollback. Trust me it has happened to me a couple of times so I am sure it has happened to others.
I also agree that abstraction is the best. Along with the fact, the original purpose of an ORM is to make RDBMS match up nicely to the OO concepts. However, if everything worked fine before LINQ by having to deviate a bit from OO concepts then screw 'em. Concepts and reality don't always fit well together. There is no room for militant zealots in IT.
I'm assuming you mean Linq To Sql
For any CRUD command it's easy to profile the performance of a stored procedure vs. any other technology. In this case any difference between the two will be negligible. Try profiling 100,000 select queries for an object with 5 simple-typed fields to find out if there's a real difference.
On the other hand the real deal-breaker will be the question on whether you feel comfortable putting your business logic on your database or not, which is an argument against stored procedures.
All these answers leaning towards LINQ are mainly talking about EASE of DEVELOPMENT which is more or less connected to poor quality of coding or laziness in coding. I am like that only.
Some advantages of LINQ that I read here, such as being easy to test and easy to debug, are in no way connected to the final output or the end user. This is always going to cause trouble for the end user on performance. What's the point of loading many things into memory and then applying filters to them using LINQ?
Again, type safety is just the caution that "we are careful to avoid wrong typecasting", which is again poor quality that we are trying to improve by using LINQ. Even in that case, if anything in the database changes, e.g. the size of a string column, then the LINQ code needs to be recompiled and would not be type-safe without that. I tried.
Although we found it good, sweet, interesting etc. while working with LINQ, it has the sheer disadvantage of making the developer lazy :) and it has been shown 1000 times that it is bad (maybe worst) on performance compared to stored procs.
Stop being lazy. I am trying hard. :)
For simple CRUD operations with a single data access point, I would say go for LINQ if you feel comfortable with the syntax. For more complicated logic I think sprocs are more efficient performance-wise if you are good at T-SQL and its more advanced operations. You also have the help of Tuning Advisor, SQL Server Profiler, debugging your queries from SSMS etc.
The outcome can be summarized as
LinqToSql: for small sites and prototypes. It really saves time when prototyping.
SPs: universal. I can fine-tune my queries and always check the actual or estimated execution plan.
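CREATE PROCEDURE userInfoProcedure
    -- Parameters for the stored procedure. T-SQL parameter names start with @.
    -- varchar with no length defaults to 1 character in a declaration,
    -- so a length is given here (50 is an assumed column size).
    @FirstName varchar(50),
    @LastName varchar(50)
AS
BEGIN
    SET NOCOUNT ON;
    -- Return the user matching both names.
    SELECT FirstName, LastName, Age
    FROM UserInfo
    WHERE FirstName = @FirstName
      AND LastName = @LastName
END
GO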
http://www.totaldotnet.com/Article/ShowArticle121_StoreProcBasic.aspx
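The two approaches can also be mixed. As a minimal sketch, a procedure such as userInfoProcedure above could be called through LINQ to SQL's DataContext.ExecuteQuery. The result class, its property types and the helper name are assumptions for illustration.

```csharp
using System.Collections.Generic;
using System.Data.Linq;
using System.Linq;

// Hypothetical result type: property names match the procedure's result
// columns; the type of Age is an assumption.
public class UserInfoRow
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
}

public static class StoredProcFromLinqSketch
{
    public static List<UserInfoRow> FindUsers(
        DataContext ctx, string firstName, string lastName)
    {
        // {0} and {1} are sent as SQL parameters, in the order the
        // procedure declares them.
        return ctx.ExecuteQuery<UserInfoRow>(
            "EXEC userInfoProcedure {0}, {1}", firstName, lastName).ToList();
    }
}
```

The query logic stays in the database, where it can be tuned and tested on its own, while the application still gets typed objects back.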
Stored procedures make testing easier, and you can change the query without touching the application code. Also, with LINQ, getting data back does not mean it's the right data. Testing the correctness of the data means running the application, but with a stored procedure it's easy to test without touching the application.
Both LINQ and SQL have their places. Both have their disadvantages and advantages.
Sometimes for complex data retrieval you might need stored procs. And sometimes you may want other people to use your stored proc in Sql Server Management Studio.
Linq to Entities is great for fast CRUD development.
Sure, you can build an app using only one or the other. Or you can mix it up. It all comes down to your requirements. But SQL stored procs will not go away any time soon.

Resources