Managing LINQ to SQL .dbml model complexity

This is addressed to a degree in another question on LINQ to SQL .dbml best practices, but I am not sure how to add to that question.
One of our applications uses LINQ to SQL, and we currently have one .dbml file for the entire database, which is becoming difficult to manage. We are looking at refactoring it into separate files that are more module/functionality specific, but one problem is that many of the high-level classes would have to be duplicated in several .dbml files, since associations can't be used across .dbml files (as far as I know), along with the additional partial-class code.
Has anyone grappled with this problem and what recommendations would you make?

Take advantage of the namespace settings. You can get to them in the Properties window by clicking the white space of the O/R designer surface.
This allows me to have a Users table and a User class for one set of business rules, and a second Users table and User class (over the same data store) for another set of business rules.
Or, break up the library, which should also have the effect of changing the namespacing, depending on your company's naming conventions. I've never worked on an enterprise app where I needed access to every single table.
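As a rough sketch (the namespace names and members below are made up for illustration), the same Users table can then back two independent partial classes:

    namespace MyApp.Billing
    {
        // Partial half of the designer-generated User class from the Billing .dbml
        // (its Namespace property is set to MyApp.Billing); billing rules live here.
        public partial class User
        {
            public bool CanReceiveInvoices { get { return true; } }
        }
    }

    namespace MyApp.Membership
    {
        // Partial half of the User class generated from a second .dbml over the
        // same Users table, with its Namespace property set to MyApp.Membership.
        public partial class User
        {
            public bool CanLogIn { get { return true; } }
        }
    }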

Past a certain size, it probably becomes easier to work with the XML directly instead of the dbml designer.

I have written a tool too! Mine is for scripting changes to dbml files using C#, so you can rerun them and not lose changes. See my blog http://www.adverseconditionals.com for more details.
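The blog post itself isn't quoted here, but the general idea of scripting a .dbml change with LINQ to XML looks roughly like this (the file name and the renamed member are hypothetical):

    using System.Linq;
    using System.Xml.Linq;

    class DbmlPatch
    {
        static void Main()
        {
            // A .dbml file is plain XML in the LINQ to SQL dbml namespace,
            // so it can be edited programmatically and the edit re-run safely.
            XNamespace ns = "http://schemas.microsoft.com/linqtosql/dbml/2007";
            XDocument dbml = XDocument.Load("Northwind.dbml");

            XElement column = dbml.Descendants(ns + "Column")
                                  .FirstOrDefault(c => (string)c.Attribute("Name") == "EmployeeID");
            if (column != null)
                column.SetAttributeValue("Member", "EmployeeId");  // rename the CLR member only

            dbml.Save("Northwind.dbml");
        }
    }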

The approach that we've used is to keep two .dbml files. One of them holds the stored procs, and all production DB access is done through it. The other lives in a unit-test folder and holds the tables and their relationships; it is used for DB data manipulation and querying in unit tests.

I have written a utility to address exactly that problem; I needed a quick app that lets you select only the database objects you need. In my case I often needed a complex view, but no tables.
http://www.codeplex.com/SqlMetalInclude/


How to handle multiple customers with different SQL databases

Summary
I have a project with multiple existing MSSQL databases. I have already created an Azure Analysis Services instance and deployed my first tabular cube to it, and I have tested access to the Analysis Services instance; it works perfectly.
Finally, I have to duplicate the setup described above for ~90 databases (90 different customers).
I'm unsure how to organize this project, and I'm not sure what options I have.
What I did
I have browsed the Internet for information, but I only found a single source where somebody asked a similar question; the first reply there is what I was already thinking about, as described below.
The last reply I don't really understand: what does he mean by "one solution"? Is there another hierarchy level above the project?
Question
One possibility would be to import each database as a source in the same project, but I think this means I would have to import each table from each source, which finally means 5 * 90 = 450 tables. I think this would quickly get out of control?
I also thought about duplicating the whole Visual Studio project folder ~90 times, once per customer, but at the moment I can't find all the references where the name would need to change; I don't think that would be too hard, though.
Is there an easier way to achieve my goal? Especially regarding maintainability.
Solution
I will create a completely new database with all the needed tables. Into those tables I will copy the data from all customer databases, adding a new customerId column. I'll transfer the data with a recurring job (periodicity to be defined), and handle updates to already existing rows in a customer database with a trigger.
For this, the best approach would be to create a staging database and import the data from the other databases into it, so your tabular model can read the data from one place.
Connecting to 90+ databases is going to be a massive admin overhead, and getting the cube to load them effectively is going to be problematic. Move the data using SSIS/Data Factory, as you'll be able to better orchestrate the data movement and incremental loads that way. Then, if you need to add/remove/update data sources, it's not done in the cube; it's all done at the database/Data Factory level.
Just use one database for all the customers and differentiate each customer with a customer_id column.
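As a hedged sketch of the "recurring copy job" idea (table, column, and connection names are all hypothetical), the per-customer load into a shared staging table could look like this:

    using System.Data.SqlClient;

    static class StagingLoader
    {
        // Copies one customer's rows into the consolidated staging table,
        // stamping each row with that customer's id.
        public static void CopyCustomerOrders(string customerConn, string stagingConn, int customerId)
        {
            using (var source = new SqlConnection(customerConn))
            using (var target = new SqlConnection(stagingConn))
            {
                source.Open();
                target.Open();

                var read = new SqlCommand("SELECT OrderId, Amount FROM dbo.Orders", source);
                using (SqlDataReader reader = read.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        var write = new SqlCommand(
                            "INSERT INTO dbo.Orders (CustomerId, OrderId, Amount) VALUES (@cid, @oid, @amt)",
                            target);
                        write.Parameters.AddWithValue("@cid", customerId);
                        write.Parameters.AddWithValue("@oid", reader.GetInt32(0));
                        write.Parameters.AddWithValue("@amt", reader.GetDecimal(1));
                        write.ExecuteNonQuery();
                    }
                }
            }
        }
    }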

Mapping View to Entity using EF 5 Code First

I have developed a fairly nice web app using EF 5 and code first, but while running benchmarks I found that the performance was not as good as I wanted. Looking further, I figured out that the queries EF generates are all essentially SELECT * FROM, which is not best practice.
Reading this answer here, Select Specific Columns from Database using EF Code First, I understood that I could create a view and map it to an entity. My question is: how do I map a view to an entity, or vice versa, using EF 5 code first?
The reason I'm asking: in one case I have a very wide table on which I perform a "preliminary search", looking items up by name and only later going back for the rest of the row; in another I have a big table where most of the time I only use the Title and Description and not the LOB column. In all those cases I'm pulling something from the database that I'm not using.
So if I could indeed map a view to an entity (or vice versa), I could save a lot of bandwidth between the backbone and the application tier.
It's not exactly what you're asking about - i.e. not an exact answer - but it does address performance, via what EF calls 'views'.
I'd suggest you try out the EF Power Tools and its 'Generate Views' command.
Running that adds a 'views' file (a .cs file) to the project, which improves core EF performance (this is an EF feature rather than a code-first one, but with the Power Tools we can now use it with code first as well).
It doesn't add 'DB views'; as far as I can tell, it works by pre-analyzing the model and code-generating the SQL templates.
"Before the Entity Framework can execute a query against a conceptual
model or save changes to the data source, it must generate a set of
local query views to access the database. The views are part of the
metadata which is cached per application domain. If you create
multiple object context instances in the same application domain, they
will reuse views from the cached metadata rather than regenerating
them. Because view generation is a significant part of the overall
cost of executing a single query, the Entity Framework enables you to
pre-generate these views and include them in the compiled project. For
more information, see Performance Considerations (Entity Framework)."
http://msdn.microsoft.com/en-us/library/bb896240.aspx
I could 'feel' a boost in performance.
Notes:
There are a couple of issues with it, and you might get some exceptions running it the first time:
Make sure your class is the only context in the file (it takes the first one).
I had to move the project out of a 'solution dir' (a trick I learned from the PowerShell console, which required the same).
Also, any other attempt to manually 'tweak' the DB with 'real' views would be futile, I think, as that isn't closely integrated with the ORM (you need more than one view, matching calls, etc.).
The way I achieve that is not very clean, but:
I create a type
declare a DbSet for the type
drop the table in the DB if necessary
create a view named after the dropped table, with the same fields (names and types).
Of course all of that is encapsulated in the Seed method (see the sketch below).
Not clean, but it works. I think some trouble will come if you want to "migrate" the structure of the view, but this way everything behaves almost as if it were a regular entity. Of course insertion and update may be touchy, but that is not my purpose here.
If you respect the naming conventions, even the loading strategies are available.
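A minimal sketch of those steps, with a hypothetical ProductSummary type (the table, view, and column names are made up; reads work as usual, but writes will fail unless the view is updatable):

    using System.Data.Entity;

    public class ProductSummary
    {
        public int Id { get; set; }
        public string Title { get; set; }
        public string Description { get; set; }
    }

    public class CatalogContext : DbContext
    {
        // Declared like any other set; it just ends up backed by a view.
        public DbSet<ProductSummary> ProductSummaries { get; set; }
    }

    public class CatalogInitializer : DropCreateDatabaseIfModelChanges<CatalogContext>
    {
        protected override void Seed(CatalogContext context)
        {
            // Drop the table EF generated for ProductSummary and create a view
            // with the same name, exposing only the narrow columns we query.
            context.Database.ExecuteSqlCommand("DROP TABLE dbo.ProductSummaries");
            context.Database.ExecuteSqlCommand(
                "CREATE VIEW dbo.ProductSummaries AS SELECT Id, Title, Description FROM dbo.Products");
        }
    }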

Moving from SQL-query-based approach to Linq

I'm not great at either LINQ or SQL, but I have worked more with SQL and less with LINQ. I've gone through many articles that favor LINQ. I don't want to go the SQL way (i.e. writing stored procedures, manipulating data, etc.).
I want to start with LINQ for every operation related with data. Here are the reasons why I want to do this:
I want to have complete control of my database via the application rather than by writing stored procs (as I'm not so good at writing stored procedures)
I want my project to be easy to maintain
I want faster development
For that, I know that:
I need to add a dbml file, drag and drop tables into that
Use the DbContext class, and so on
But I want to know, is there a way:
Can I avoid creating a dbml file and still be able to access the database?
Do I need to use LINQ to Entities for that?
Would it be a good idea to avoid the dbml file, since for every database change I currently have to drag and drop the tables again?
Also, I've come across many posts where LINQ to SQL is considered deprecated and not part of .NET's future?
I have so many doubts, but I think that's normal when starting with a new technology.
I found this useful article which is good for beginners:
http://weblogs.asp.net/scottgu/archive/2010/08/03/using-ef-code-first-with-an-existing-database.aspx
After doing some more research I came to the following conclusions:
1) Can I avoid creating a dbml file and still be able to access the database?
ANS: Yes, but instead of a dbml file, an edmx file will be created.
2) Do I need to use LINQ to Entities for that?
ANS: Yes, you can go with LINQ to Entities.
3) Would it be a good idea to avoid using a dbml file, since for every database change I have to drag and drop the tables again?
ANS: It is not required to drop and re-create the tables; there are options to update only selected parts of your model from the database. And you are not really avoiding dbml files: an edmx file will be created instead, which is similar to a dbml file in many ways.
4) Also, I've come across many posts where LINQ to SQL is considered deprecated and not part of .NET's future?
ANS: Yes, for future development it is effectively deprecated; it supports only SQL Server as the backend.
I hope I'm right. Please do tell me if you have any other suggestions.
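For reference, the approach in the linked ScottGu article (EF code first against an existing database) skips the designer files entirely; a minimal sketch, with hypothetical entity and connection-string names:

    using System.Data.Entity;
    using System.Linq;

    // POCO entity mapped by convention to an existing Customers table.
    public class Customer
    {
        public int CustomerId { get; set; }
        public string Name { get; set; }
    }

    public class ShopContext : DbContext
    {
        // "ShopDb" is an assumed connection-string name in web.config/app.config.
        public ShopContext() : base("ShopDb") { }

        public DbSet<Customer> Customers { get; set; }
    }

    class Program
    {
        static void Main()
        {
            using (var db = new ShopContext())
            {
                var customersStartingWithA = db.Customers
                                               .Where(c => c.Name.StartsWith("A"))
                                               .ToList();
            }
        }
    }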
LINQ is a way to query and project collections of data. For example, you can use LINQ to query and shape data from a database or from an array. LINQ by itself has nothing to do with the underlying database.
You use an ORM (Object Relational Mapper) technology to project data stored in tables of a database as collections of objects. Once you have the collection of objects, you can use LINQ to query them.
Now, you have many ORM technologies to choose from, such as Entity Framework, NHibernate, and LINQ to SQL. If you don't want to maintain a dbml file, have a look at the code-first approach offered by Entity Framework.
Then there are things called LINQ data providers. They take a LINQ query, transform it into SQL targeting a particular database, execute the query, and return the results as a set of objects. Many of the ORMs above have a built-in LINQ data provider and do this work behind the scenes when fetching data.
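A tiny illustration of that separation (the Orders query in the comment assumes a hypothetical EF context):

    using System.Linq;

    class LinqDemo
    {
        static void Main()
        {
            // LINQ to Objects: the same query operators work on a plain array.
            int[] numbers = { 5, 12, 8, 21 };
            var big = numbers.Where(n => n > 10).OrderBy(n => n).ToList();

            // With an ORM, the identical query shape targets an entity set, and the
            // LINQ provider translates it into SQL behind the scenes, e.g.:
            // var bigOrders = db.Orders.Where(o => o.Total > 10).OrderBy(o => o.Total).ToList();
        }
    }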
I would advise you to look at patterns such as Repository and Unit of Work for your data layer. When used correctly, these patterns isolate your data access code from your application's upper layers. This will help you change your data access technology if it becomes obsolete, without affecting the rest of the application.
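As a rough sketch of what that isolation can look like (the interface names here are illustrative, not a prescribed API):

    using System;
    using System.Collections.Generic;

    // Upper layers depend only on these abstractions, so the underlying data access
    // technology (LINQ to SQL, Entity Framework, NHibernate, ...) can be swapped later.
    public interface IRepository<T> where T : class
    {
        T GetById(int id);
        IEnumerable<T> Find(Func<T, bool> predicate);
        void Add(T entity);
        void Remove(T entity);
    }

    public interface IUnitOfWork : IDisposable
    {
        IRepository<T> Repository<T>() where T : class;
        void Commit();   // persist all pending changes together
    }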
LINQ is an awesome technology and you should definitely try it
I have composed the above answer based on my own experience and I am sure there are many SO users with better understanding of the above technologies than myself who may wish to add their own opinion
Good luck

Code first approach versus database first approach

I am working on an ASP.NET MVC 3 web application and I am using database first. After mapping the DB tables into entity classes using Entity Framework, I interact with those tables the same way I would in the code-first approach, by dealing with database tables as classes and objects.
So after mapping the tables into entity classes, I find that the code-first and DB-first approaches are very similar, except that instead of writing the entity classes from scratch (as in code first), I have generated them from the existing database tables - which is easier and more convenient in my case.
So are there specific cases where some functionality is only available with one approach and not the other? So far I haven't found any.
Having dealt with many many headaches using db-1st EDMX pre EF 4.1, I am partial to code-first. But I'm not going to evangelize it.
In addition to the direct sproc mapping and function import features mentioned in Pawel's answer and comment, you won't be able to change the namespaces or any other code in the generated files when you use db-first. AFAIK all of the files are nested under the .tt file; if there is a way to move them into logical folders and namespaces in your project, I'm not aware of it.
Also, if you ever want to separate your DbContext into a project separate from your entities, I recall this was possible pre-EF 4.1, but it was more cumbersome because you had to run the custom tool on both .tt files after each DB change. With code first this is pretty straightforward, because you're dealing with pure OOP.
I think that the biggest limitation of CodeFirst (as compared to ModelFirst/DatabaseFirst approaches) is that you cannot map your CUD operations to stored procedures. If you are not planning to do that then you should be good to go.
To be more specific: you can invoke stored procedures using the SqlQuery method on DbSet (which causes the returned entities to be tracked), or the more general SqlQuery and ExecuteSqlCommand methods on the Database class (for Database.SqlQuery the returned objects do not have to be entities, and there is no tracking for them). That's about it. You cannot map Create/Update/Delete operations to stored procedures, and function imports are not supported either.
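A short illustration of those options (the context, entity, and stored procedure names are hypothetical):

    using System;
    using System.Linq;

    static class StoredProcExamples
    {
        public static void Run()
        {
            using (var db = new ShopContext())   // hypothetical DbContext with a Customers set
            {
                // DbSet.SqlQuery: returned entities are tracked by the context.
                var active = db.Customers.SqlQuery("EXEC dbo.GetActiveCustomers").ToList();

                // Database.SqlQuery: results need not be entities; no change tracking.
                var names = db.Database.SqlQuery<string>("SELECT Name FROM dbo.Customers").ToList();

                // Database.ExecuteSqlCommand: for statements that return no result set.
                db.Database.ExecuteSqlCommand(
                    "UPDATE dbo.Customers SET IsActive = 0 WHERE LastOrderDate < @p0",
                    new DateTime(2012, 1, 1));
            }
        }
    }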
EDIT
It's possible to map CUD operations to stored procedures in EF6 now

Building Oracle DB; Good Directory Layout

I'm looking for advice on how best to organize a new Oracle schema and its dependent files in my project directory - the sequences, triggers, DDL, etc. I've been using one monolithic file called schema.sql for some time, but I'm wondering if there's a best practice? Something like...
database/
tables/
person.sql
group.sql
sequences/
person.sequence
group.sequence
triggers/
new_person.trigger
Penny for your thoughts or a URL that I may have missed!
Thank you!
Storing DDL by object type is a reasonable approach-- anything is likely to be easier to navigate than a monolithic SQL script. Personally, though, I'd much rather have DDL organized by function. If you're building an accounting system, for example, you probably have a series of objects to manage accounts payable and a separate set of objects to manage accounts receivable along with some core objects for managing the general ledger accounts. That would lead to something along the lines of
database/
general_ledger/
tables/
packages/
sequences/
accounts_receivable/
tables/
packages/
sequences/
accounts_payable/
tables/
packages/
sequences/
As the system gets more complex, that hierarchy would naturally get deeper over time. This sort of approach would more naturally mirror the way non-database code is stored in source control. You wouldn't have a single directory of Java classes in a directory structure like
middle_tier/
java/
Foo.java
Bar.java
You would organize the classes that implement the same sorts of business logic together and separate from the classes that implement different bits of business logic.
One item to consider is which SQL scripts can act as 'latest only' scripts. These include CREATE OR REPLACE PROCEDURE/FUNCTION/TRIGGER, etc. You run the latest version and don't need to worry about what may have previously existed in the database.
On the other hand you have tables where you may start off with a CREATE TABLE followed by several ALTER TABLEs as changes to the schema evolve. And if you are doing an upgrade you may want to apply several of the ALTER TABLE scripts (preferably in order).
I'd argue against a 'functional grouping' unless it is really obvious where the lines are drawn. You probably don't want to be in a position where you have a USERS table in one group and a USER_AUTHORITIES in another and an AUTHORITY group in a third.
If you do have decent separation, then they are probably in separate schemas and you do want to keep schemas distinct (since you can have the same object names in different schemas).
The division-by-object-type arrangement, with the addition of a "schema" directory below the database directory works well for me.
I've worked with source control systems that have the additional division-by-function layer; if there are many objects, it adds extra searching when you're trying to cross-reference a source control file with the object you see in a database GUI navigator, which generally groups objects by type. It's also not always clear how an object should be classified this way.
Consider adding a "grants" directory for the grants made by that schema to other schemas or roles, with one file per grantee. If you have "rule-based" grants such as "the APPLICATION_USER role always gets SELECT on all of schema X's tables", then write a PL/SQL anonymous block to perform this action. (You might be tempted to reverse-engineer the grants after they get put in place by some ad-hoc method, but it's easy to miss something when new tables or views are added to the application).
Standardize on a delimiter for all scripts and you'll make your life easier if you start deploying through a build utility such as Ant. Using "/" (vs. ";") works for both SQL statements as well as PL/SQL anonymous blocks.
In our projects we use a somewhat combined approach: we have the core of our program at the root and the other functionality in subfolders:
root/
plugins/
auth/
mail/
report/
etc.
In all of these folders we have both DDL and DML scripts; almost all of them can be run more than once, e.g. all packages are defined with create or replace..., all data-insertion scripts check whether the data already exists, and so on. This gives us the opportunity to run almost all scripts without worrying that we might break something.
Obviously this scenario can't be applied to create table and similar statements. For those scripts we have manually written a small bash script that extracts the specified files and runs them, not failing on particular ORA errors such as ORA-00955: name is already used by an existing object.
Also, all files are mixed together in the directories but differ by extension: .seq for a sequence, .tbl for a table, .pkg for a package interface, .bdy for a package body, .trg for a trigger, and so on.
We also have a naming convention of prefixes for all of our files: we can have a cl_oper.tbl table with its cl_oper.seq sequence and cl_oper.trg trigger, plus cl_oper_processing.pkg together with cl_oper_processing.bdy holding the logic for those objects. With this naming convention it's very easy, in a file manager, to see all the files connected with a particular unit of logic in our project (whereas grouping directories by object type does not provide this).
Hope this information helps you somehow. Please leave comments if you have any questions.
