Switch from Entity Framework Database First to Code First

Our Solution is currently based on Entity Framework Database First. We have a T4 Template that generates repository classes from the EDMX.
We are reviewing our planned approach for releasing changes, especially Database changes. If we continue with Database first, then we will need to separately generate scripts to change the development and other databases.
It seems that with Code First, we simply change the model and that generates scripts to change the various databases. This seems more straightforward, does not involve hand-crafting scripting processes, and carries lower risk.
So, if we make the switch, is it simply a case of:
Moving the previously generated models (they're all currently in one Class File) from the EDMX in our Entities Project to (preferably separate) Class Files in a folder within the Entities Project
Adjusting the T4 Template to pick up the models from their new location
No longer using the EDMX and Update from Database
When we want to make a change to the model, simply changing the (previously but no longer generated) classes
Using Code First Migrations to implement changes to the Test and
other databases
Finally, how would we see the relationships between the models? Is there a way of creating a diagram?
Thanks,
Chris

This is something I have done in the past. Over time I have tried many different methods. Currently I am using the EF Reverse POCO Generator to generate the Model/POCO classes initially and to allow changes made to the database to be reflected in code. I can also make changes to the existing classes manually. Then I generate migrations for each change.
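For context, a generated migration is just a class with Up/Down methods. Below is a minimal sketch of what one might look like; the migration name, table and column are hypothetical, not from the question.

// A minimal sketch of a Code First migration such as Add-Migration would generate.
// Table and column names are hypothetical.
using System.Data.Entity.Migrations;

public partial class AddCustomerPhone : DbMigration
{
    public override void Up()
    {
        // Applied when the database is upgraded (Update-Database).
        AddColumn("dbo.Customers", "Phone", c => c.String(maxLength: 50));
    }

    public override void Down()
    {
        // Applied when the migration is rolled back.
        DropColumn("dbo.Customers", "Phone");
    }
}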
The code-first model allows you to define navigation properties the same as you have now, so you can see the relationships both through the code itself and through tools for visualizing class relationships.
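For example, a pair of hypothetical POCOs (not from the question's model) might express a one-to-many relationship like this:

// A minimal sketch, assuming hypothetical Customer/Order entities, of how
// navigation properties express relationships in a code-first POCO model.
using System.Collections.Generic;

public class Customer
{
    public int CustomerId { get; set; }
    public string Name { get; set; }

    // One-to-many: a customer has many orders.
    public virtual ICollection<Order> Orders { get; set; }
}

public class Order
{
    public int OrderId { get; set; }

    // Foreign key plus reference navigation property back to the customer.
    public int CustomerId { get; set; }
    public virtual Customer Customer { get; set; }
}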
If you want to see the database structure you can use the 'Database Diagrams' feature in MS SQL Server, if that is what you are using. It is my impression that you are encouraged to use tools other than Entity Framework itself for visualizing database or class relationships. This allows the EF team to focus on database code instead of complex UI integration with VS.
Personally, I depend on Database Diagrams to check my class structure and the DB the classes produce, but I find it natural to just look at the POCO classes. I haven't found any exceptional class diagramming tools.
All that being said, you are correct in your statements, although I would start from the EF Reverse POCO generated classes created from your existing DB to give you that added flexibility. Point your T4 at those classes using reflection instead of XML parsing (look at T4 Toolbox for output file management) to get you started.
Reflecting on assemblies is a sticky bit when you get started. You need to make sure the EF/POCO assembly is compiled so it can be loaded into memory and reflected on by your T4. On rare occasions, depending on how you are loading the assembly, it can stop refreshing the assembly, so you have to restart VS. I ran across this a couple of times a month, so it wasn't a deal breaker for me. Once I got it up and running it made sense.
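As a rough sketch of the reflection side (written as plain C# rather than T4 directives; the assembly path and namespace below are hypothetical placeholders, not from the answer):

// A rough sketch of the kind of reflection a T4 template can do over a compiled
// POCO assembly. The path and namespace are hypothetical placeholders.
using System;
using System.Linq;
using System.Reflection;

class PocoScanner
{
    static void Main()
    {
        // The POCO/EF project must already be compiled so the assembly exists on disk.
        var assembly = Assembly.LoadFrom(@"..\Entities\bin\Debug\Entities.dll");

        var pocoTypes = assembly.GetTypes()
            .Where(t => t.IsClass && t.Namespace == "Entities.Models");

        foreach (var type in pocoTypes)
        {
            // A real template would emit repository code here instead of printing.
            Console.WriteLine(type.Name);
            foreach (var prop in type.GetProperties())
                Console.WriteLine("  " + prop.Name + " : " + prop.PropertyType.Name);
        }
    }
}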

Related

NHibernate, Sessions, MVVM and Repositories

I am beginning to wonder if ANYONE uses NHibernate with a WPF or Win Forms application, such is the dearth of examples or text books on the subject. I am struggling to find "best practices" for its use, and especially session and sessionfactory management, with an MVVM WPF application and repositories.
To jump right in, it seems that the preference is to supply the repository with an ISession. But where is this instantiated - in the ViewModel? And if so, does this not create an uncomfortable dependency between the VM and NH (or is that just simply unavoidable, no matter how you dress it up)? Any implications for a multi-user application?
With the repository pattern - should I use one large repository for all objects (and hence one session) or, as seems more manageable at first sight, should the repositories be split up in some logical business-related way? But if split up, how then to manage sessions? In my case, a form/window does not just deal with one entity (maybe it should...?) but with more than one. I don't want the ORM side to be dictated by the UI form design (maybe it should!?)
And then again, SessionFactory - where, and when to create it - once, at app startup?
Any good pointers or references for an NH app that is not web-based would be much appreciated.
Here is a reference to a similar question, but it was posed over four years ago: Using Unit of Work design pattern / NHibernate Sessions in an MVVM WPF
Many thanks
I've been using NHibernate with MVVM for years; once you get it going it's great. The MSDN article Building a Desktop To-Do Application with NHibernate covers the whole issue of session management rather well and is definitely worth a read.
One thing that will make life a lot easier is the use of a good dependency injection framework. I personally use Ninject, and one of the things I particularly like is its support for object scoping. For example, you can set your NHibernate session object (and thus the entity repositories) to scope to the pages in your application using InScope, so anything within the hierarchy of a given page that asks the injection framework for a session object will get a reference to the same instance.
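A minimal sketch of that kind of binding, assuming a hypothetical ambient "current page" object (the PageTracker class is an illustration, not part of Ninject or the answer):

// A minimal sketch: binding an NHibernate ISession in Ninject so that everything
// resolved while the same page is current shares one session instance.
using NHibernate;
using Ninject;

// Hypothetical ambient tracker for the page/view that is currently active.
public static class PageTracker
{
    public static object CurrentPage { get; set; }
}

public static class NinjectConfig
{
    public static IKernel CreateKernel(ISessionFactory sessionFactory)
    {
        var kernel = new StandardKernel();

        // All requests made while the same page object is current get the same ISession.
        kernel.Bind<ISession>()
              .ToMethod(ctx => sessionFactory.OpenSession())
              .InScope(ctx => PageTracker.CurrentPage);

        return kernel;
    }
}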
There are lots of other advantages to going down this route. For example, it's very easy to use things like Castle Dynamic Proxy to inject property change notification into classes, so that the entities you get back from your database queries support it automatically and can therefore be bound to directly in the view or subscribed to by other class instances in your model or view model. The same goes for lists, which can be problematic because replacing a database entity list with an ObservableCollection<> can cause the database to think the entire list has changed, which in turn causes performance problems when every single entity starts serializing itself back to disk regardless of whether or not it has actually changed.
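As a rough sketch of the Castle Dynamic Proxy idea (not the poster's code; the entity and interceptor names are hypothetical, and a production version would need more care):

// A minimal sketch of adding INotifyPropertyChanged to an entity via a proxy.
using System;
using System.ComponentModel;
using Castle.DynamicProxy;

public class Customer
{
    public virtual string Name { get; set; }   // members must be virtual to be intercepted
}

public class NotifyPropertyChangedInterceptor : IInterceptor
{
    private PropertyChangedEventHandler _handler;

    public void Intercept(IInvocation invocation)
    {
        // Handle the event accessors that the proxy adds for INotifyPropertyChanged.
        if (invocation.Method.Name == "add_PropertyChanged")
        {
            _handler = (PropertyChangedEventHandler)Delegate.Combine(_handler, (Delegate)invocation.Arguments[0]);
            return;
        }
        if (invocation.Method.Name == "remove_PropertyChanged")
        {
            _handler = (PropertyChangedEventHandler)Delegate.Remove(_handler, (Delegate)invocation.Arguments[0]);
            return;
        }

        invocation.Proceed();

        // Raise PropertyChanged after any property setter runs.
        if (invocation.Method.Name.StartsWith("set_") && _handler != null)
        {
            _handler(invocation.Proxy,
                new PropertyChangedEventArgs(invocation.Method.Name.Substring(4)));
        }
    }
}

// Usage: the proxy implements INotifyPropertyChanged even though Customer does not.
// var customer = (Customer)new ProxyGenerator().CreateClassProxy(
//     typeof(Customer), new[] { typeof(INotifyPropertyChanged) },
//     new NotifyPropertyChangedInterceptor());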

Using the Entity Framework and ADO in a hybrid setting for transition?

We have an app in ASP.NET MVC 3 that, due to legacy and porting reasons, is written entirely using traditional ADO.NET for the data layer.
I am now tasked with adding some reporting to this website, and the reports can result in some extremely complicated queries.
Are there any pitfalls in using the EF Power Tools to reverse-engineer a code first model and using it side-by-side with our current ADO.NET model? Doing so would allow me to use LINQ for querying the data I need, greatly speeding up the time required to write each report. I would need to shut off data context initialization, as we have our current model do that, but are there any glaring risks or problems associated with trying to do this?
If it's of any relevance (I know EF 5 has a ton of new features), we are using .NET 4 and will begin moving to .NET 4.5 as soon as it launches.
I think this is a very sensible thing to do. You could also use a database-first model, which you can refresh whenever the database changes and which does not try to initialize a database.
Since you will use the context read-only, you can optimize the query process by setting the MergeOption property of your ObjectQuery instances to MergeOption.NoTracking. This reduces overhead because the context will not track changes to the generated objects.
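As a rough illustration (not from the answer), a small helper like the one below configures an ObjectQuery for read-only use; with a DbContext-based model the counterpart is the query's AsNoTracking() method. The usage comment assumes a hypothetical generated context with an Orders set.

// A minimal sketch of configuring queries for read-only reporting use.
using System.Data.Objects;

public static class ReportQueryHelper
{
    // Turns off change tracking for the given query, since reports never call SaveChanges.
    public static ObjectQuery<T> ForReporting<T>(ObjectQuery<T> query)
    {
        query.MergeOption = MergeOption.NoTracking;
        return query;
    }
}

// Usage (context.Orders here is a hypothetical ObjectSet<Order> on a generated context):
// var orders = ReportQueryHelper.ForReporting(context.Orders).Where(o => o.Year == 2012);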
A problem might be that there is more maintenance if the database changes, but I think the absence of walls of boilerplate query code for reporting on the old data layer far outweighs that.
One day :) you may even decide to use the EF model to display data that users want to filter in the UI and use the old data layer for CUD commands (a bit like CQRS).

Entity Framework 4.1 - Going crazy with the number of options

I've built a site in MVC3 using EF 4.0 using the Repository pattern. Everything was going good but I'm starting to run into a lot of "The relationship between the two objects cannot be defined because they are attached to different ObjectContext objects" errors. It seems that my repository layer is getting the contexts all mixed up, so I figured it might just be easier to start a new EF4.1 project.
At first I looked into Repository Pattern + Unit of Work, but came across some threads saying that this isn't needed for EF 4.1. I came across this thread saying "DbContext is implementation of unit of work pattern and IDbSet is implementation of repository pattern." I figured maybe then I could just use that. Upon further inspection, though, it seems that DbContext uses the Code First approach, which as far as I can tell will drop and create the database again if the POCOs change. I need to keep the data in my database, so as far as I can tell that option is out.
My head is spinning right now with EF options. Is the Repository pattern needed with EF 4.1? Is DbContext meant for working on databases that are already full of data? Is there a better way of managing the entity contexts that doesn't involve these issues?
Any push in the right direction would be great =/
A few comments. For details I recommend doing some basic research using a search engine.
...I'm starting to run into a lot of "The relationship between the two
objects cannot be defined because they are attached to different
ObjectContext objects" errors. It seems that my repository layer is
getting the contexts all mixed up, so I figured it might just be
easier to start a new EF4.1 project.
If you have this error you did something wrong. EF 4.1 won't protect you from making the same mistake again, because you also cannot change relationships between objects that are attached to different DbContexts. You just have to analyze and debug your code and find the source of the problem.
...this thread saying "DbContext is implementation of unit of work
pattern and IDbSet is implementation of repository pattern.". I
figured maybe then I could just use that...
ObjectContext and ObjectSet<T> are an implementation of those patterns as well. This is no reason to change the version of Entity Framework.
Upon further inspection though it seems that DbContext uses the Code
First approach...
You can also use Database First and Model First approach with DbContext.
...which as far as I can tell will drop and create the database again
if the POCOs change. I need to keep the data in my database, so as far
as I can tell that option is out.
You can turn that feature off. Also, EF 4.3 has a migration feature which helps to update and evolve an existing database schema.
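For illustration, turning that feature off looks like the sketch below; MyContext is a hypothetical context class, not something from the question.

// A minimal sketch of disabling database creation/initialization so Code First
// never drops or creates an existing database.
using System.Data.Entity;

public class MyContext : DbContext
{
    // DbSets for your existing tables go here.
}

public static class Bootstrapper
{
    public static void ConfigureEntityFramework()
    {
        // Passing null removes the default CreateDatabaseIfNotExists initializer,
        // so the existing database and its data are left alone.
        Database.SetInitializer<MyContext>(null);
    }
}

With EF 4.3 migrations, the usual Package Manager Console flow for evolving an existing schema is Enable-Migrations, then Add-Migration and Update-Database for each change.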
Is the Repository pattern needed with EF4.1?
No. It's also not needed for ObjectContext. To be precise, you don't need to write your own (abstract) repository on top of EF, because EF is already an implementation of that pattern.
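For example (with a hypothetical context and entity), using EF's built-in pieces directly, DbContext plays the unit-of-work role and DbSet<T> the repository role:

// A minimal sketch of using DbContext as the unit of work and DbSet<T> as the
// repository, without a custom repository layer on top. Types are hypothetical.
using System.Data.Entity;
using System.Linq;

public class Post
{
    public int PostId { get; set; }
    public string Title { get; set; }
    public bool Published { get; set; }
}

public class BlogContext : DbContext
{
    public DbSet<Post> Posts { get; set; }   // the "repository"
}

public static class Example
{
    public static void PublishDrafts()
    {
        using (var context = new BlogContext())        // the "unit of work"
        {
            var drafts = context.Posts.Where(p => !p.Published).ToList();
            drafts.ForEach(p => p.Published = true);

            context.SaveChanges();                      // commit all changes together
        }
    }
}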
Is DbContext meant for working on databases that are already full of
data?
Yes. The additional feature to create a database from code (Code-First) is mainly a productivity tool for the development phase of your application which is supposed to be disabled in production.

Is Entity Framework 4.1 the best solution for a web application that is using almost 400 database tables?

Is Entity Framework 4.1 the best solution for a web application that uses almost 400 database tables, or would it be best to make a custom data access layer with straight SQL and stored procedures?
The number of tables only affects EF initialization, where "views" must be compiled when the context is used for the first time in the application - 400 is a lot, and it will take a lot of time. This can be sped up by generating source code for the views and adding that source code to the project - the views will not be compiled at runtime because the compiled code will be part of your application, but you must do this manually each time you change the model. For EFv4.1 this feature is offered in EF Power Tools CTP1. For EDMX the feature is offered in the EdmGen command line tool.
Another impact of such a number of tables is on development. Using 400 tables in a single EDMX seems impossible, so you will need multiple contexts with different sets of mapped entities. This can be a complex task for the application architecture because working with multiple contexts makes everything harder.
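As a rough illustration (hypothetical entities and context names, not tied to any particular schema), such a split might look like several narrowly focused contexts:

// A minimal sketch of splitting the mapping across several focused contexts
// instead of one giant model.
using System.Data.Entity;

public class Order { public int OrderId { get; set; } }
public class Customer { public int CustomerId { get; set; } }
public class SalesSummary { public int SalesSummaryId { get; set; } }

// Each context maps only the subset of tables its area of the application needs.
public class OrderingContext : DbContext
{
    public DbSet<Order> Orders { get; set; }
    public DbSet<Customer> Customers { get; set; }
}

public class ReportingContext : DbContext
{
    public DbSet<SalesSummary> SalesSummaries { get; set; }
}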
If you want to use code-only mapping you must either write classes and mappings for 400 tables yourself, or again use EF Power Tools CTP1, which will generate them for you.
It is not impossible to use EF with 400 tables but it is complex and requires some experience.
Different people may have different views on it. The answer to your question depends on how you use EF: whether it will be a POCO model or you are going to use an EDMX. EF performance depends on the number of records as well, so if you are using 10 tables and altogether they have 100 records, then EF will perform better.
Still, the performance of EF in terms of tables and records is a big question, and it depends on various factors, including whether your database design is properly normalised or not.
We have more than 1000 tables, and we are using it (it's a web app). You do not have to put everything in a single model; that would make it pretty darn hard to work on the model itself. We are using the Code Only approach; it's still in development, but everything works fine so far.

How do you manage schema upgrades to a production database?

This seems to be an overlooked area that could really use some insight. What are your best practices for:
making an upgrade procedure
backing out in case of errors
syncing code and database changes
testing prior to deployment
mechanics of modifying the table
etc...
Liquibase
liquibase.org:
it understands hibernate definitions.
it generates better schema update sql than hibernate
it logs which upgrades have been made to a database
it handles two-step changes (e.g. delete a column "foo" and then rename a different column to "foo")
it handles the concept of conditional upgrades
the developer actually listens to the community (with Hibernate, if you are not in the "in" crowd or you are a newbie, you are basically ignored)
http://www.liquibase.org
opinion
the application should never handle a schema update. This is a disaster waiting to happen. Data outlasts the applications, and as soon as multiple applications try to work with the same data (the production app + a reporting app, for example), chances are they will both use the same underlying company libraries... and then both programs decide to do their own DB upgrade... have fun with that mess.
I am a big fan of Red Gate products that help create SQL packages to update database schemas. The database scripts can be added to source control to help with versioning and rollback.
In general my rule is: "The application should manage its own schema."
This means schema upgrade scripts are part of any upgrade package for the application and run automatically when the application starts. In case of errors the application fails to start and the upgrade script transaction is not committed. The downside to this is that the application has to have full modification access to the schema (this annoys DBAs).
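A minimal sketch of that "upgrade on startup" idea, not the answer's actual code: run any pending scripts inside a transaction and let the failure propagate so the application refuses to start. The folder layout is hypothetical, and a real version would also record which scripts have already been applied.

// Runs all *.sql upgrade scripts from a folder inside one transaction at startup.
using System;
using System.Data.SqlClient;
using System.IO;
using System.Linq;

public static class SchemaUpgrader
{
    public static void Run(string connectionString, string scriptsFolder)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (var transaction = connection.BeginTransaction())
            {
                try
                {
                    // Apply scripts in a deterministic (alphabetical) order.
                    foreach (var file in Directory.GetFiles(scriptsFolder, "*.sql").OrderBy(f => f))
                    {
                        var sql = File.ReadAllText(file);
                        using (var command = new SqlCommand(sql, connection, transaction))
                        {
                            command.ExecuteNonQuery();
                        }
                    }
                    transaction.Commit();
                }
                catch
                {
                    // Any failure rolls the upgrade back; the caller lets the application fail to start.
                    transaction.Rollback();
                    throw;
                }
            }
        }
    }
}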
I've had great success using Hibernate's SchemaUpdate feature to manage the table structures, leaving the upgrade scripts to handle only actual data initialization and the occasional removal of columns (SchemaUpdate doesn't do that).
Regarding testing, since the upgrades are part of the application, testing them becomes part of the test cycle for the application.
Afterthought: Taking on board some of the criticism in other posts here, note the rule says "its own". It only really applies where the application owns the schema, as is generally the case with software sold as a product. If your software is sharing a database with other software, use other methods.
That's a great question. (There is a high chance this is going to end up a normalised versus denormalised database debate... which I am not going to start... okay, now for some input.)
Some off-the-top-of-my-head things I have done (I will add more when I have some more time or need a break):
Client design - this is where the VB method of inline SQL (even with prepared statements) gets you into trouble. You can spend AGES just finding those statements. If you use something like Hibernate and put as much SQL as possible into named queries, you have a single place for most of the SQL (nothing is worse than trying to test SQL that is inside some IF statement when your testing never hits the "trigger" criteria for that IF statement). Prior to using Hibernate (or other ORMs), when I would write SQL directly in JDBC or ODBC, I would put all the SQL statements either as public fields of an object (with a naming convention) or in a property file (also with a naming convention for the values, say PREP_STMT_xxxx), and use either reflection or iteration over the values at startup in a) test cases and b) startup of the application. Some RDBMSs allow you to pre-compile prepared statements before execution, so on startup, post login, I would pre-compile the prepared statements to make the application self-testing. Even for hundreds of statements on a good RDBMS that's only a few seconds, and only once. And it has saved my butt a lot. On one project the DBAs wouldn't communicate (a different team, in a different country) and the schema seemed to change NIGHTLY, for no reason. And each morning we got a list of exactly where it broke the application, on startup.
If you need ad hoc functionality, put it in a well-named class (i.e. again, a naming convention helps with automated testing) that acts as some sort of factory for your query (i.e. it builds the query). You are going to have to write the equivalent code anyway, right? Just put it in a place where you can test it. You can even write some basic test methods on the same object or in a separate class.
If you can, also try to use stored procedures. They are a bit harder to test, as above. Some databases also don't pre-validate the SQL in stored procs against the schema at compile time, only at run time. It usually involves, say, taking a copy of the schema structure (no data) and then creating all stored procs against this copy (in case the DB team making the changes didn't validate correctly); thus the structure can be checked. But as a point of change management, stored procs are great: on change, everyone gets it, especially when the DB changes are a result of business process changes, and all languages (Java, VB, etc.) get the change.
I usually also set up a table I call system_setting etc. In this table we keep a VERSION identifier. This is so that client libraries can connect and validate whether they are valid for this version of the schema. Depending on the changes to your schema, you don't want to allow clients to connect if they can corrupt your schema (i.e. you don't have a lot of referential rules in the DB, but on the client). It depends if you are also going to have multiple client versions (which does happen in non-web apps, i.e. they are running the wrong binary). You could also have batch tools etc. Another approach which I have also done is to define a set of schema-to-operation versions in some sort of property file, or again in a system_info table. This table is loaded on login and then used by each "manager" (I usually have some sort of client-side API to do most DB stuff) to validate for that operation whether it is the right version. Thus most operations can succeed, but you can also fail (throw some exception) on out-of-date methods, and it tells you WHY.
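A tiny sketch of that version check at login time; the system_setting table is from the text above, but the column names and the expected-version constant are hypothetical.

// A minimal sketch of a client validating the schema VERSION on login.
using System;
using System.Data.SqlClient;

public static class SchemaVersionCheck
{
    private const string ExpectedVersion = "2.3";   // the schema version this client binary targets

    public static void Validate(SqlConnection openConnection)
    {
        // Column names ("name", "value") are assumptions for illustration.
        using (var command = new SqlCommand(
            "SELECT value FROM system_setting WHERE name = 'VERSION'", openConnection))
        {
            var actual = command.ExecuteScalar() as string;
            if (actual != ExpectedVersion)
                throw new InvalidOperationException(
                    "Schema version '" + actual + "' does not match expected '" + ExpectedVersion + "'.");
        }
    }
}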
Managing the change to the schema -> do you update the table or add 1-1 relationships to new tables? I have seen a lot of shops which always access data via a view for this reason. This allows table names, columns etc. to change. I have played with the idea of actually treating views like interfaces in COM, i.e. you add a new VIEW for new functionality/versions. Often, what gets you here is that you can have a lot of reports (especially end-user custom reports) that assume table formats. The views allow you to deploy a new table format but support existing client apps (remember all those pesky ad hoc reports).
Also, you need to write update and rollback scripts. And again: TEST, TEST, TEST...
------------ OKAY - THIS IS A BIT RANDOM DISCUSSION TIME --------------
I actually worked on a large commercial project (i.e. a software shop) where we had the same problem. The architecture was two-tier and they were using a product a bit like PHP, but pre-PHP. Same thing, different name. Anyway, I came in at version 2...
It was costing A LOT OF MONEY to do upgrades. A lot - i.e. giving away weeks of free consulting time on site.
And it was getting to the point of wanting either to add new features or to optimize the code. Some of the existing code used stored procedures, so we had common points where we could manage code, but other areas were this embedded SQL markup in HTML. Which was great for getting to market quickly, but with each iteration of new features the cost to test and maintain at least doubled. So when we were looking at pulling the PHP-type code out, putting in data layers (this was 2001-2002, pre any ORMs etc.) and adding a lot of new features (customer feedback), we looked at this issue of how to engineer UPGRADES into the system. Which is a big deal, as upgrades cost a lot of money to do correctly. Now, most patterns and all the other stuff people discuss with a degree of energy deal with OO code that is running, but what about the fact that a) your data has to integrate with this logic, and b) the meaning and also the structure of the data can change over time? And, often due to the way data works, you end up with a lot of sub-processes/applications in your client's organisation that need that data -> ad hoc reporting or any complex custom reporting, as well as batch jobs that have been done for custom data feeds etc.
With this in mind I started playing with something a bit left of field. It also has a few assumptions: a) data is read much more heavily than it is written; b) updates do happen, but not at bank levels, i.e. say one or two a second.
The idea was to apply a COM/interface view to how data was accessed by clients over a set of CONCRETE tables (which varied with schema changes). You could create a separate view for each type of operation - update, delete, insert and read. This is important. The views would either map directly to a table, or allow you to trigger off a dummy table that does the real updates or inserts etc. What I actually wanted was some sort of trappable level of indirection that could still be used by Crystal Reports etc. NOTE - for inserts, updates and deletes you could also use stored procs. And you had a version for each version of the product. That way your version 1.0 had its version of the schema, and if the tables changed, you would still have the version 1.0 VIEWS but with NEW backend logic to map to the new tables as needed, but you also had version 2.0 views that would support new fields etc. This was really just to support ad hoc reporting, which, if you're a BUSINESS person and not a coder, is probably the whole point of why you have the product. (Your product can be crap, but if you have the best reporting in the world you can still win; the reverse is true - your product can be the best feature-wise, but if it's the worst at reporting you can very easily lose.)
okay, hope some of those ideas help.
These are all weighty topics, but here is my recommendation for updating.
You did not specify your platform, but for NAnt build environments I use Tarantino. For every database update you are ready to commit, you make a change script (using Red Gate or another tool). When you build to production, Tarantino checks whether the script has been run on the database (it adds a table to your database to keep track). If not, the script is run. It takes all the manual work (read: human error) out of managing database versions.
I've heard good things about iBATIS 3 Schema Migrations System:
User Guide: http://svn.apache.org/repos/asf/ibatis/java/ibatis-3/trunk/doc/en/iBATIS-3-Migrations.pdf
As Pat said, use Liquibase, especially when you have several developers with their own dev databases making changes that will become part of the production database.
If there's only one dev, as on one project I'm on now (ha), I just commit the schema changes as SQL text files into a CVS repo, which I check out in batches on the production server when the code changes go in.
But Liquibase is better organized than that!

Resources