We have built a database in SQL Server using two patterns we found in Len Silverston's Data Model Resource Book Vol. 3: one for Types and Categories (Classification), and another for Hierarchies, Aggregations, and Peer-to-Peer Relationships (Association). Essentially, it is implemented as follows:
Classification
[Entity] (M x M) [EntityType] (M x M) [EntityTypeType]
...where the (M x M) is a many-to-many relationship implemented in the database by a Classification table.
Association
The entity and type tables above also have their own typed Rollups:
[Entity] (M x M) [Entity]
[EntityType] (M x M) [EntityType]
[EntityTypeType] (M x M) [EntityTypeType]
...where each (M x M) is a many-to-many relationship implemented in a Rollup table with a foreign key to the type of rollup/association.
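To make the shape concrete, here is a minimal sketch of how these tables might surface as EF code-first classes. All names are hypothetical; the point is that each M x M link carries a payload, so it has to be mapped as an entity in its own right rather than as a plain many-to-many:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch only - names and payload columns are invented.
public class Entity
{
    public int Id { get; set; }
    public virtual ICollection<EntityClassification> Classifications { get; set; }
    public virtual ICollection<EntityRollup> Rollups { get; set; }
}

public class EntityType
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Classification: the M x M between Entity and EntityType, with its payload.
public class EntityClassification
{
    public int Id { get; set; }
    public int EntityId { get; set; }
    public int EntityTypeId { get; set; }
    public DateTime FromDate { get; set; }
    public DateTime? ThruDate { get; set; }   // null = still in effect
    public virtual Entity Entity { get; set; }
    public virtual EntityType EntityType { get; set; }
}

// Rollup: the typed M x M from Entity back to Entity.
public class EntityRollup
{
    public int Id { get; set; }
    public int ParentEntityId { get; set; }
    public int ChildEntityId { get; set; }
    public int RollupTypeId { get; set; }     // FK to the type of rollup/association
    public DateTime FromDate { get; set; }
    public DateTime? ThruDate { get; set; }
}
```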
The resulting data structure gives us tremendous expressive ability in terms of describing both our entities and their relationships to one another. However, we're having trouble taking advantage of this expressiveness in our applications. The primary issue is that in spite of the advances in EF 4 and 5, many-to-many relationships are still tricky, and beyond that we're trying to traverse the M2Ms in at least two directions whenever we hit the database. It is especially complicated by the following:
We subtype both [Entity] and some subtypes of [Entity].
All of the M2M tables - all the classification and rollup/association tables - have a payload that contains at least a From and a Thru date. The rollups also contain at least a rollup type.
We don't want to have to load large, distant portions of the typing schema (EntityTypeType tables and their roll-ups) in order to interpret the data at runtime every time we access entities.
Technologies:
SQL Server 2008 R2
Entity Framework 5
.NET 4.5
MVC 4 (in the web app portion, we also have some Console Apps)
Questions about the model itself:
Is this simply an unworkable data model in .NET?
Should we first flatten our database into more .NET friendly views that essentially model our business objects?
Questions about the typing scheme - bear in mind that the types are pretty static:
Should we scaffold the [EntityType] and [EntityTypeType] tables, their classifications, and their rollups into C# classes? This would work similarly to enum scaffolders, only we need more than a name/int pair, since these types carry date-range and type payloads. If so, what are some ideas for how to scaffold those files - as static classes? Hard-coded object lists?
Should we instead cache the typing scheme at start-up (this bothers me, because it adds a lot of overhead to starting up the Console Apps)?
Any other ideas - scaffolded XML Files? etc...
Any ideas or experiences are much appreciated!
I tried to answer each question, but I have to admit I'm not sure if you are trying to dynamically create entities on top of a database at run-time - or if you're just trying to create entities dynamically before run-time.
If you're trying to release code that dynamically changes/adjusts when the schema in SQL Server is changed, then I would have some different answers. =)
Is this simply an unworkable data model in .NET?
Some things you mentioned that stood out to me:
Lots of M x M relationships.
Entity/EntityType/EntityTypeType
Rollups
Some questions I have after reading:
Did you guys pick a framework for modeling data in the hopes it would make everything easier?
Did you pick a framework because it seemed like the "right" way to do it?
I have a hard time following how you've modeled the data. What is an EntityTypeType exactly?
Are all the M x M relationships really needed? Just because Entity A and Entity B can be in a M x M relationship, should they?
I don't know your domain, but I know I have a hard time following what you've described. =) In my opinion it has already become somewhat unworkable for two reasons: 1) you're asking on SO about it, and 2) it's not easy to describe without a lot of text.
Should we first flatten our database into more .NET friendly views that essentially model our business objects?
Yes!
At least from my experience I would say yes. =)
I think it's ok to create entities in the database that have complex relationships. Parent/child, peer to peer, hierarchical, etc. All fine and good and normal.
But the application shouldn't have to interpret all of that to make sense of it. The database is good at doing joins, grouping data, creating views, etc. I would advise creating views or stored procedures that get as close to your business objects as possible - at least for the reads.
Also consider that if you push the complexity of relating objects to the application, you might pay some performance penalties.
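To illustrate the "flatten for reads" idea, here is a hedged sketch using EF 5's Database.SqlQuery to hydrate a plain DTO from a view - the view and class names are invented:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical DTO matching the columns of a flattened view.
public class CurrentEntitySummary
{
    public int EntityId { get; set; }
    public string EntityName { get; set; }
    public string CurrentTypeName { get; set; }
}

public class EntityReads
{
    public List<CurrentEntitySummary> LoadSummaries()
    {
        using (var context = new MyDbContext())   // hypothetical DbContext
        {
            // EF maps columns to properties by name; no change tracking for reads.
            return context.Database
                .SqlQuery<CurrentEntitySummary>(
                    "SELECT EntityId, EntityName, CurrentTypeName FROM dbo.vw_CurrentEntitySummary")
                .ToList();
        }
    }
}
```

Alternatively, EF can map the view like a read-only table in the model, which keeps LINQ available for filtering.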
Should we scaffold the [EntityType] and [EntityTypeType] tables, their classifications, and their rollups into C# classes?
I would advise against this. The only reason is that now you are doing database work in the application layer. If you want to do joins/rollups/etc. you're managing that data in the application - not the database.
I'm sure you already know this, but you want to avoid pulling everything in the database back into the application and then running queries/building objects there.
SQL Server is awesome at relating objects and pulling together different entities. It'll be very fast at that layer.
I would create business objects and populate those from SQL.
If you needed to scaffold EntityType/EntityTypeType, just be careful you aren't running into N+1 issues.
What are some ideas for how to scaffold those files - as static classes? Hard-coded object lists?
Options:
Use an ORM to generate the classes for you. You've already seen some of that with EF.
Write your own code generation tool by looking at the SQL Server schemas for your entities.
Write classes by hand that aren't generated. I see this done all the time and it's not as bad as you think. Definitely some grunt work, but also gives flexibility.
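Whichever of those options you pick, the generated or hand-written shape could be an "enum with payload" - a sealed class exposing static, hard-coded instances. A hedged sketch, with invented ids, names, and dates:

```csharp
using System;
using System.Collections.Generic;

// Hedged sketch: an "enum with payload" for a static type row.
// A generator (e.g. a T4 template) could emit this straight from the EntityType table.
public sealed class EntityTypeDef
{
    public int Id { get; private set; }
    public string Name { get; private set; }
    public DateTime FromDate { get; private set; }
    public DateTime? ThruDate { get; private set; }

    private EntityTypeDef(int id, string name, DateTime from, DateTime? thru)
    {
        Id = id; Name = name; FromDate = from; ThruDate = thru;
    }

    public static readonly EntityTypeDef Customer =
        new EntityTypeDef(1, "Customer", new DateTime(2000, 1, 1), null);
    public static readonly EntityTypeDef Supplier =
        new EntityTypeDef(2, "Supplier", new DateTime(2000, 1, 1), null);

    // Keeps lookups and validation in one place.
    public static readonly IReadOnlyList<EntityTypeDef> All =
        new[] { Customer, Supplier };
}
```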
Should we instead cache the typing scheme at start-up (this bothers me, because it adds a lot of overhead to starting up the Console Apps)?
You want to generate the business objects based on your entities at application load? Or, another way of phrasing the question: generate proxy classes at runtime?
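If the concern is cold-start cost in the console apps, one hedged compromise is to cache lazily rather than eagerly at start-up, so the typing scheme is only loaded on first use (the context and DbSet names are invented):

```csharp
using System;
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;

// Hedged sketch: the typing scheme is loaded once, on first use,
// so console apps that never touch it pay nothing at start-up.
public static class TypeSchemeCache
{
    private static readonly Lazy<Dictionary<int, EntityType>> _types =
        new Lazy<Dictionary<int, EntityType>>(LoadTypes, isThreadSafe: true);

    public static EntityType Get(int id)
    {
        return _types.Value[id];
    }

    private static Dictionary<int, EntityType> LoadTypes()
    {
        using (var context = new MyDbContext())    // hypothetical DbContext
        {
            return context.EntityTypes             // hypothetical DbSet
                          .AsNoTracking()          // read-only snapshot
                          .ToDictionary(t => t.Id);
        }
    }
}
```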
I am not familiar with Len Silverston's book, and I assume neither is the vast majority of Stack Overflow users. Since you didn't get answers to your question within three days, I'll comment on it anyhow.
What strikes me as odd in your database model is that you seem to have multiple nested N:M relationships. There rarely ever are true N:M relationships - most of the time, the intermediate table in an N:M model actually represents an object that exists in real life, leaving you with two regular 1:N relationships and more than just two columns in the linking table. I've only come across one or two true N:M relations in the last 25 years, so I rather doubt you've actually got three here - it seems more likely that you're using a pattern that does not apply in the way you think it does.
What you describe smells a bit like you're trying to model a database in a database. Especially the "Should we first flatten our database into more .NET friendly views that essentially model our business objects?" sounds fishy. With very few exceptions, your database tables should match your business objects to begin with. If you need to create a flattened view to see your actual objects, you might be heading entirely in the wrong direction. But then again, it's hard to tell from the information you gave.
Please give a less abstract description of what you're trying to do and what you've done so far.
What are the implications (on performance and other aspects) of having Rails associations between ActiveRecord models that are not mapped to a database relationship?
I have seen this in a real project and have searched for these implications without any luck. These are some of the documents I have found:
http://guides.rubyonrails.org/association_basics.html
http://www.ibm.com/developerworks/library/os-railsn1/
May not be an answer but it would make for a long comment and may add value to the OP.
Correct me if I am wrong, but I believe what the OP is getting at is that the code creates the relation, while the database does not have a key relationship using foreign and primary keys inside the actual database, since Rails uses an object-relational mapping (ORM) structure instead of a database-relational structure.
Not sure if there are really any drawback implications (although I am sure many would disagree), and the second article seems to focus mostly on N+1 issues, which you can resolve by writing appropriate code. Many N+1 issues spring out of code where people do not want to be explicit with their attribute selections, so Rails obliges by returning all attributes of a record. Then, when you request that object's relationship, Rails graciously runs another query for you and returns what you asked for - but when you do this in large iterations it can create a heavy load and cause performance issues.
You can avoid these issues by explicitly selecting exactly what you want using select and group, as well as include or eager_load statements. Yes, these are more complicated to write initially, as many of them require more SQL and less Rails-y syntax, but the performance increase can be drastic when querying large tables or multiple relationships.
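For example, a quick sketch with invented model names:

```ruby
# N+1: one query for the posts, then one more query per post for its author.
posts = Post.limit(100)
posts.each { |post| puts post.author.name }

# Eager loading: two queries total with includes (or one LEFT JOIN with eager_load).
posts = Post.includes(:author).limit(100)
posts.each { |post| puts post.author.name }

# Trim the select list when full objects aren't needed.
rows = Post.joins(:author)
           .group("authors.name")
           .select("authors.name, COUNT(*) AS post_count")
```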
These are statements you would generally write by hand in stored procedures or SQL queries. Rails still gives you these freedoms, but they are often ignored because they look more like SQL than Ruby. I have seen people complain more than once both that they will have to change this code if they change the DB, and that it "looks bad". Well, if you were writing this from scratch against a standard database, both of these issues would still apply.
As many have stated, "ORM is just a tool," and any tool that is used improperly can have drastic implications. As long as you understand the tools you are using, ORM can be very powerful and far more concise; but if you ignore the features your tool provides, expect consequences with an extensive reach.
If you use a chainsaw as a hammer, expect to lose an arm.
UPDATE
I am unaware of any performance improvements from defining relationships inside the database, other than that they offer a hint as to where to index values. (You can create indexes during Rails migrations; the primary keys are indexed for you, but the relational columns generally are not.) Adding indexes to relational fields will generally speed up database performance whether or not the specific relationship is actually defined in the database. Someone Asked a Similar Question
Database relationships are more about data integrity than anything else, and in an ORM design those concerns should be handled inside your models, not in the database itself.
Here is another resource to take a look at
And a few more SO questions on the subject
In SQL Server 2008, do relationships make queries faster?
Does Foreign Key improve query performance?
What's wrong with foreign keys?
Is there a severe performance hit for using Foreign Keys in SQL Server?
SQL Server Foreign Key constraint benefits
You get the idea; it's just about asking the right question.
I have an MVC3 NHibernate/ActiveRecord project. The project is going okay, and I'm getting a bit of use out of my model objects (mostly one giant hierarchy of three or four classes).
My application is analytics-based; I store hierarchical data and later slice it up, display it in graphs, etc., so the actual relationships are not that complicated.
So far, I haven't benefited much from ORM; it makes querying easy (ActiveRecord), but I frequently need less information than full objects, and I need to write "hard" queries through multiple complex selects and iterations over collections - raw SQL would be much faster and cleaner.
So I'm thinking about ditching ORM in this case, and going back to raw SQL. But I'm not sure how to rearchitect my solution. How should I handle the database tier?
Should I still have one class per model, with static methods to query for objects? Or should I have one class representing the DB?
Should I write my own layer under ActiveRecord (or my own ActiveRecord-like implementation) to keep the existing code more or less sound?
Should I combine ORM methods (like Save/Delete) into my model classes or not?
Should I change my table structure (one table per class with all of the fields)?
Any advice would be appreciated. I'm trying to figure out the best architecture and design to go with.
Many, including myself, think the ActiveRecord pattern is an anti-pattern mainly because it breaks the SRP and doesn't allow POCO objects (tightly coupling your domain to a particular ORM).
In saying that, you can't beat an ORM for simple CRUD stuff, so I would keep some kind of ORM around for that kind of work. Just re-architect your application to use POCO objects and some kind of repository pattern, with your ORM implementation specifics in another project.
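A rough sketch of that split, with invented names - the domain project holds the POCO and the interface, and only the infrastructure project references the ORM:

```csharp
// Plain POCO - no ORM base class or attributes, so the domain stays portable.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Defined alongside the domain; the NHibernate/ActiveRecord-specific
// implementation lives in a separate infrastructure project.
public interface ICustomerRepository
{
    Customer GetById(int id);
    void Save(Customer customer);
    void Delete(Customer customer);
}
```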
As for your "hard" queries, I would consider creating one class per view using a tiny ORM (like Dapper, PetaPoco, or Massive), to query the objects with your own raw sql.
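For instance, a hedged Dapper sketch - the result class and SQL are invented:

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using Dapper;

// One class per "hard" query result shape.
public class SalesByMonth
{
    public int Year { get; set; }
    public int Month { get; set; }
    public decimal Total { get; set; }
}

public class SalesQueries
{
    private readonly string _connectionString;
    public SalesQueries(string connectionString) { _connectionString = connectionString; }

    public List<SalesByMonth> LoadSalesByMonth()
    {
        using (var connection = new SqlConnection(_connectionString))
        {
            // Dapper maps result columns to properties by name.
            return connection.Query<SalesByMonth>(
                @"SELECT YEAR(OrderDate) AS [Year], MONTH(OrderDate) AS [Month],
                         SUM(Total) AS Total
                  FROM Orders
                  GROUP BY YEAR(OrderDate), MONTH(OrderDate)").ToList();
        }
    }
}
```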
I have predefined tables in a database, and I have to develop a web application based on them. Should I base my model classes on the structure of the data in the tables?
But a problem is that the tables are very poorly defined and there is much redundant data in them (which I cannot change!). E.g., these two tables share three columns:

Table: Student_details
Student_id, Name, Age, Class, School

Table: Student_address
Student_id, Name, Age, Street1, Street2, City
I think you should make your models in a way that is best suited to how they will be used. Don't worry about how or where the data is stored... otherwise, why go through the trouble of layering your code? Why not just do the direct DB query right in your view? So if you are going to create an abstraction of your data - a "model" - make one that is designed around how it will be used, not how it is or will be persisted.
This seems like a risky project - presumably, there's another application somewhere which populates these tables. As the data model is not very sound from a relational point of view, I'm guessing there's a bunch of business/data logic glued into that app - for instance, putting the student age into the Student_address table.
I'd support jsobo in recommending you build your business logic independently of the underlying persistance mechanism, and that you try to keep your models as domain focused as possible, without too much emphasis on how the database happens to be structured.
You should, however, plan on spending a certain amount of time translating your domain models into their respective data representations and dealing with whatever quirks the data model imposes. I'd strongly recommend containing all this stuff in a separate translation layer - don't litter it throughout the rest of the application.
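A hedged sketch of such a translation layer against the tables above - the row classes just mirror the legacy columns, and the translator is the only code that knows Name and Age are duplicated:

```csharp
// Row shapes mirroring the legacy tables (names hypothetical).
public class StudentDetailsRow
{
    public int StudentId; public string Name; public int Age;
    public string Class; public string School;
}

public class StudentAddressRow
{
    public int StudentId; public string Name; public int Age;
    public string Street1; public string Street2; public string City;
}

// Domain model: one Student, however the legacy rows duplicate the data.
public class Student
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int Age { get; set; }
    public string City { get; set; }
}

// The translation layer reconciles the redundant columns in one place;
// here Student_details is treated as authoritative for Name and Age.
public class StudentTranslator
{
    public Student ToDomain(StudentDetailsRow details, StudentAddressRow address)
    {
        return new Student
        {
            Id = details.StudentId,
            Name = details.Name,
            Age = details.Age,
            City = address.City
        };
    }
}
```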
When using LINQ to SQL or Entity Framework, do we need to separate the application into three layers: BLL, DAL, and interface?
Do what works for you. Building a wedding website with a handful of links and getting five content pages out of a database? More than one layer seems like tremendous overkill. On the flip side, for a very complex or large project, I think you'd want at least some degree of separation, because it saves time, confusion, and sanity.
It matters what you're working on and how much division it requires. Ultimately it's what you and your team prefer. There's no right answer, it's what fits the situation.
In projects I've been developing, I find value in creating a data layer (DL) even when using Linq2Sql for data access.
My main reason is that many of the calls to the DL to retrieve one or more business objects from the DB actually require more than one call to the database, especially when implementing an eager-loading strategy. And when saving a business object whose data is stored in multiple tables, a transaction can be established across multiple calls to the database.
The business layer doesn't need to know that; it should be able to make a single call to the DL and leave it to the DL to do all the tedious querying and collation of data into business objects.
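A hedged sketch of that shape - the Order type and the helper methods stand in for the individual database calls the DL would make:

```csharp
using System.Transactions;

public class OrderDataLayer
{
    // The business layer makes this one call; the DL coordinates the
    // multiple round-trips underneath it in a single transaction.
    public void SaveOrder(Order order)   // Order is a hypothetical business object
    {
        using (var scope = new TransactionScope())
        {
            SaveOrderHeader(order);        // hypothetical: INSERT into the header table
            SaveOrderLines(order.Lines);   // hypothetical: INSERT the detail rows
            scope.Complete();              // commit only if every call succeeded
        }
    }
}
```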
I'm with #MikeJacobs.
I've actually written a LINQ2SQL library which abstracts away ALL the DataContext stuff, and all the .Insert(), .Execute(), and .SubmitChanges() calls.
It's really nice to just abstract that away. In LINQ2SQL you're still dependent on all your layers knowing about the LINQ2SQL entities, but my top layers very rarely send complex lambdas to the DAL; most of that is done in the DAL.
I have a legacy VB6 app which I am rewriting in .NET. I have not used an ORM package before (being the old-fashioned type who likes to know what SQL is being used), but I have seen good reports of NHibernate and I am tempted to use it for this project. I just want to check I won't be shooting myself in the foot.
Because my new app will initially run alongside the existing one, any ORM I use must work with the existing database schema. Also, I need to use SQL Server text searching. From what I gather, LINQ to SQL does not support text searching, so this rules it out.
The app uses its own method of allocating IDs for new objects - will NHibernate allow this, or does it expect to use its own mechanisms?
Also, I have read that NHibernate does caching. I need to make sure that rows inserted outside of NHibernate are immediately accessible when accessing the database from NHibernate, and vice versa.
There are 4 or 5 main tables and 10 or so subsidiary tables. Although a couple of the main tables have up to a million rows, the app itself will normally only be returning a few. The user load is low, so I don't anticipate performance being a problem.
At the moment I'm not sure whether it will be ASP.NET or win forms but either way I will be expecting to use data binding.
In terms of functionality, the app is not particularly complicated - the budget to re-implement it is about 20 man-days, so if I am going to use an ORM it has to be something that will start paying for itself pretty quickly. Similarly, I want the app to be simple to deploy and not require some monster enterprise framework.
Any thoughts on whether this is a suitable project for NHibernate would be much appreciated.
While ORMs are good, I personally wouldn't take on the risk of trying to use any ORM on a 20 day project if I had to absorb the ORM learning curve as I went.
If you have ADO.NET infrastructure you are comfortable with and you can live without ORM features, that is the much less risky approach to take.
You should learn ORMs and Linq (not necessarily Linq To Sql) eventually, but it's much more enjoyable when there is no immediate time pressure.
This sounds more like a risk management issue and that requires you to make a personal decision about how willing you are to see the project fail due to embracing new (to you) technologies.
You might also check out LLBL Gen Pro. It is a very mature ORM that handles a lot of different scenarios.
I have successfully fitted an NHibernate domain model to a few legacy database schemas - it's not yet proved impossible, but it is sometimes not without its difficulties. The easiest schemas to map are those where all primary keys and foreign keys are single column ones, but with so few tables you should be able to do the mapping relatively quickly even if this is not true of yours.
I strongly recommend, particularly given your timescale, that you use Fluent NHibernate to do your mappings - the time to learn the XML mapping file syntax may be too big an ask. However, you will still need an XML mapping file for your full-text indexing stuff (assuming that's what you meant), writing those as named SQL queries. (See the nhibernate.info documentation for details.)
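On the ID-allocation question: NHibernate does support application-assigned identifiers, and in Fluent NHibernate that's a one-line choice in the mapping. A hedged sketch with invented entity and column names:

```csharp
using FluentNHibernate.Mapping;

// Hypothetical entity - members are virtual so NHibernate can proxy them.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

public class CustomerMap : ClassMap<Customer>
{
    public CustomerMap()
    {
        Table("Customers");
        // "Assigned" tells NHibernate your code supplies the ID,
        // rather than an identity column or hi/lo generator.
        Id(x => x.Id).GeneratedBy.Assigned();
        Map(x => x.Name).Column("CustomerName");
    }
}
```

On caching: out of the box NHibernate only has the per-session first-level cache, so as long as you use short-lived sessions and don't enable the second-level cache, rows inserted outside NHibernate will be visible to it.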
Suggest you spend a day or two trying to create a model for a couple of your tables, and writing code to interact with them. There'll always be people on SO ready to answer any questions you have.
You may also want to take a look at Linq to NHibernate - we've found it helpful in terms of abstracting even more of our database access stuff away behind a simple interface. But it's Fluent NHibernate that will give you the biggest and quickest win in terms of "cheating" on the NHibernate learning curve.