I am building a data access layer for a DB. What data structure is recommended for passing and returning a collection?
I use a list of data access objects mapped to the db tables.
I'm not sure what language you're using, but in general, there are tradeoffs of simplicity vs extensibility.
If you return the DataSet directly, you have now coupled yourself to database-specific classes. This leaves little room for extension: what if you later allow access to files or to other types of data sources? But it is also very simple. This is the recordset pattern, and C#/VB provide a lot of built-in support for it. The GUI layer can access the recordset and easily manipulate the data. This works well for simple applications.
On the other hand, you can wrap the datasets in a custom object and provide gateway methods (see the Gateway pattern http://martinfowler.com/eaaCatalog/gateway.html). This method is more complex, but provides a lot more extensibility. In a larger application, when you need to separate the business logic, data logic, and GUI logic, this is a more robust way to go.
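As a rough illustration of the gateway idea, here is a minimal sketch in Python using the standard sqlite3 module. The names Customer, CustomerGateway, and open_demo_db are hypothetical and exist only for this example; the point is that callers receive plain objects and never touch cursors or SQL.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    """Plain object handed to callers; carries no database types."""
    id: int
    name: str


class CustomerGateway:
    """Hypothetical gateway: the only place that knows about SQL or cursors."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def find_all(self) -> list[Customer]:
        rows = self._conn.execute("SELECT id, name FROM customer ORDER BY id")
        return [Customer(id=r[0], name=r[1]) for r in rows]

    def add(self, name: str) -> Customer:
        cur = self._conn.execute("INSERT INTO customer (name) VALUES (?)", (name,))
        self._conn.commit()
        return Customer(id=cur.lastrowid, name=name)


def open_demo_db() -> sqlite3.Connection:
    """In-memory database so the sketch runs without any setup."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    return conn
```

If the data later moves to files or another source, only the gateway class would need to change, which is the extensibility the paragraph above is describing.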
For larger enterprise applications, you can look into using Object Relational Mapping (ORM) tools. They help to automatically map Java objects to database tables and hide a lot of the painful SQL details. Frameworks such as Spring provide excellent support for ORMs.
I tend to use arrays of objects, so that I can disconnect the DAO from the business logic.
You can store the data in the DAO as a dataset, for example, and give callers an easy way to stage additions and modifications before doing an update, so that when they want to commit the changes they can do it in one shot.
I prefer that the user can't add/modify the structure themselves, as it makes it harder to determine what must be changed in the database.
By initially returning an array they can then display what is in the database.
Then, as the presentation layer makes changes, the DAO can be updated by the controller. By having a loose coupling the entire system becomes more flexible, as you can change the DAO from a dataset to something else, and the rest of the application doesn't care.
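A small sketch of that arrangement, in Python. Everything here is hypothetical (ProductDao, Product, and the dict that stands in for the database); it only shows callers receiving a list of plain objects, staging changes through the DAO, and committing them in one shot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    sku: str
    price: float


class ProductDao:
    """Hypothetical DAO; a dict stands in for whatever storage sits behind it."""

    def __init__(self, stored: dict[str, float]) -> None:
        self._stored = stored                   # stand-in for the database
        self._pending: dict[str, float] = {}    # changes staged but not yet committed

    def list_all(self) -> list[Product]:
        # Callers get immutable copies, never the internal structure.
        return [Product(sku, price) for sku, price in sorted(self._stored.items())]

    def stage_price(self, sku: str, price: float) -> None:
        # Record an insert or update without touching storage yet.
        self._pending[sku] = price

    def commit(self) -> int:
        """Apply every staged change in one shot; return how many were applied."""
        applied = len(self._pending)
        self._stored.update(self._pending)
        self._pending.clear()
        return applied
```

Because the rest of the application sees only Product objects and the three methods, the internal dict could be swapped for a dataset or something else without the callers noticing.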
There are two choices that are the most generic.
The first way to look at a ResultSet is as a List of Maps, where each Map represents a row in the ResultSet. The keys are the columns listed in the SELECT clause; the values are the database values.
The second way to look at a ResultSet is as a Map of Lists, where each List represents a column in the ResultSet. The Map keys are the columns listed in the SELECT clause; the values are the Lists of database values.
If you don't want to do full-blown ORM, these can carry you a long way.
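To make the two shapes concrete, here is a short Python sketch using the standard sqlite3 module in place of a JDBC ResultSet. The helper names and the person table are made up for the example.

```python
import sqlite3


def rows_as_list_of_maps(cursor: sqlite3.Cursor) -> list[dict]:
    """One dict per row, keyed by column name."""
    columns = [d[0] for d in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def rows_as_map_of_lists(cursor: sqlite3.Cursor) -> dict[str, list]:
    """One list per column, keyed by column name."""
    columns = [d[0] for d in cursor.description]
    result = {name: [] for name in columns}
    for row in cursor.fetchall():
        for name, value in zip(columns, row):
            result[name].append(value)
    return result


# Demo data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?)", [("Ann", 31), ("Bob", 27)])
query = "SELECT name, age FROM person ORDER BY name"

by_row = rows_as_list_of_maps(conn.execute(query))
by_column = rows_as_map_of_lists(conn.execute(query))
```

The row-oriented form tends to suit record-by-record display, while the column-oriented form is handier when a whole column is processed at once, for example for charting.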
What I see a lot is that people use an Object Relational Mapper (ORM) for doing SQL stuff when working in an MVC environment. But if I have really complex queries, I would like to write the whole query myself. What is the best practice for this kind of situation?
1. Having an abstraction layer between your model and the database with the complex queries
2. Still using the model, creating specific methods that handle the queries
Or is there any other way that might be better? Please tell me :)
Consider the Single Responsibility Principle. Specifically, the question would be...
"If I put data access logic in my model, what will that mean when I need to change something?"
Any time you need to change business logic, you're also changing the objects which maintain data access logic. So the data access logic also needs to be re-tested. Conversely, any time you need to change data access logic, you're also changing the objects which maintain business logic. So the business logic also needs to be re-tested.
As the logic expands, this becomes more difficult very quickly.
The idea behind the Single Responsibility Principle is to separate the dependencies of different roles which can enact changes to the application. (Keep in mind that "roles" doesn't map 1-to-1 with "people." One person may have multiple roles, but it's still important to separate those roles.) It's a matter of simpler support. If you want to make a change to a database query (say, for performance reasons) which shouldn't have any visible effect on anything else in the system, then there's no reason to be changing objects which contain business logic.
1. Having an abstraction layer between your model and the database with the complex queries
Yes, you should have a persistence abstraction that sits between storage (database or any other data source) and your business logic. Your business logic should not depend on "where", "how" and even "if" the data is actually stored.
Basically, your code should adhere (or at least try to adhere) to SOLID principles, but as @david already pointed out, you are already violating the first one on that list.
Also, you should consider using a service layer which would be responsible for dealing with interaction between implementation of domain model and your persistence abstraction (doesn't matter whether you are using custom written data mappers or some 3rd party ORM).
In the article (more like an excerpt, actually) the "MVC model" is actually all three concentric circles together. The domain model is not code. It is actually a term that describes the accumulated knowledge about the project. Most of the domain model gets turned into pieces of code. Those pieces are referred to as domain objects.
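One way this separation might look, sketched in Python. All names (Account, AccountStore, InMemoryAccountStore, AccountService) are hypothetical, and the in-memory store is only a stand-in for a data mapper or a third-party ORM.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Account:
    """Domain object: holds business rules, knows nothing about storage."""
    id: str
    balance: int

    def withdraw(self, amount: int) -> None:
        if amount <= 0 or amount > self.balance:
            raise ValueError("invalid withdrawal amount")
        self.balance -= amount


class AccountStore(Protocol):
    """Persistence abstraction; a SQL mapper or an ORM could implement it."""

    def load(self, account_id: str) -> Optional[Account]: ...

    def save(self, account: Account) -> None: ...


class InMemoryAccountStore:
    """Stand-in implementation used here instead of a real database."""

    def __init__(self) -> None:
        self._rows: dict[str, int] = {}

    def load(self, account_id: str) -> Optional[Account]:
        balance = self._rows.get(account_id)
        return None if balance is None else Account(account_id, balance)

    def save(self, account: Account) -> None:
        self._rows[account.id] = account.balance


class AccountService:
    """Service layer: coordinates the domain object and the store."""

    def __init__(self, store: AccountStore) -> None:
        self._store = store

    def withdraw(self, account_id: str, amount: int) -> int:
        account = self._store.load(account_id)
        if account is None:
            raise KeyError(account_id)
        account.withdraw(amount)       # business rule lives in the domain object
        self._store.save(account)      # storage detail lives behind the abstraction
        return account.balance
```

Changing a query for performance reasons would then touch only the store implementation, and a change to the withdrawal rule would touch only Account.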
2. Still using the model, creating specific methods that handle the queries
This would imply an implementation of active record. It is a useful but mostly misused pattern, suited to cases where your objects have no (or almost no) business logic. Basically, you should use active record only if all you need are glorified setters and getters that talk to the database.
The active record pattern is a very good choice when you need to quickly prototype something, but it should not be used when you are attempting to implement a fully realized model layer.
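For comparison, a bare-bones active record might look like the Python sketch below. The Note class, its shared class-level connection, and the note table are all hypothetical simplifications; real active record libraries do considerably more.

```python
import sqlite3
from dataclasses import dataclass
from typing import ClassVar, Optional


@dataclass
class Note:
    """Active-record style: the object carries its own persistence methods."""
    text: str
    id: Optional[int] = None

    # Shared connection; a hypothetical simplification for this sketch.
    db: ClassVar[sqlite3.Connection] = sqlite3.connect(":memory:")

    def save(self) -> None:
        if self.id is None:
            # New object: insert and remember the generated key.
            cur = self.db.execute("INSERT INTO note (text) VALUES (?)", (self.text,))
            self.id = cur.lastrowid
        else:
            self.db.execute("UPDATE note SET text = ? WHERE id = ?", (self.text, self.id))
        self.db.commit()

    @classmethod
    def find(cls, note_id: int) -> Optional["Note"]:
        row = cls.db.execute(
            "SELECT text, id FROM note WHERE id = ?", (note_id,)
        ).fetchone()
        return None if row is None else cls(text=row[0], id=row[1])


Note.db.execute("CREATE TABLE note (id INTEGER PRIMARY KEY, text TEXT NOT NULL)")
```

Notice that any business rule added to Note would share a class with SQL statements, which is the coupling the Single Responsibility argument warns about.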
ORMs in general do not specifically have any drawbacks versus using direct SQL to fetch data from the database. ORMs, as the name implies, help in keeping your relational model (designed using your SQL DDL or using JPA annotations) and OO model in sync and help them integrate well together.
When using an ORM, you can write your queries in JPQL, which is object-oriented SQL. So instead of writing queries that manipulate tables, you are writing queries that manipulate objects. You use the relationships between these objects to get your desired result. Now I understand that sometimes it's easier to just write native SQL, so the JPA specification allows you to run native SQL! This just returns a list of "generic objects" which you can organize any way you like. When you choose to go this route and actually pick a JPA provider, like Hibernate, these providers have extended functionality. So if you do have complex relationships, you can use libraries like Hibernate Criteria Builder to help you create queries for those complex relationships.
So, if building a large MVC application, it would generally be a good idea to have this abstraction layer in the middle - handling all these relationships. It makes it easier on you the developer to just look at the big picture and the business side of the application.
IMHO, no. I think even the ORM layer often adds more complexity than needed. Databases have very good and sophisticated mechanisms for high-level data manipulation. Triggers, views, constraints, complex keys and indexes, (sub)transactions, stored procedures, and procedural extensions of the query language are normally more than enough for everything.
Because of their structural barriers, ORMs can't give a real interface to this feature set.
And the common practice is that applications use practically only a NoSQL-style record service out of all of this, and reimplement in an unneeded "middleware" what should have been the job of the database.
What I would find really interesting is if the feature set of the databases got some OO-like interface (see "SQL abstract types"), and the client-side logic went into the application (see "REST"). This would practically eliminate the need for the middle layer.
I have an MVC3 NHibernate/ActiveRecord project. The project is going okay, and I'm getting a bit of use out of my model objects (mostly one giant hierarchy of three or four classes).
My application is analytics based; I store hierarchical data, and later slice it up, display it in graphs, etc., so the actual relationship is not that complicated.
So far, I haven't benefited much from ORM; it makes querying easy (ActiveRecord), but I frequently need less information than full objects, and I need to write "hard" queries through complex and multiple selects and iterations over collections -- raw SQL would be much faster and cleaner.
So I'm thinking about ditching ORM in this case, and going back to raw SQL. But I'm not sure how to rearchitect my solution. How should I handle the database tier?
Should I still have one class per model, with static methods to query for objects? Or should I have one class representing the DB?
Should I write my own layer under ActiveRecord (or my own ActiveRecord-like implementation) to keep the existing code more or less sound?
Should I combine ORM methods (like Save/Delete) into my model classes or not?
Should I change my table structure (one table per class with all of the fields)?
Any advice would be appreciated. I'm trying to figure out the best architecture and design to go with.
Many, including myself, think the ActiveRecord pattern is an anti-pattern mainly because it breaks the SRP and doesn't allow POCO objects (tightly coupling your domain to a particular ORM).
In saying that, you can't beat an ORM for simple CRUD stuff, so I would keep some kind of ORM around for that kind of work. Just re-architect your application to use POCO objects and some kind of repository pattern, with your ORM implementation specifics in another project.
As for your "hard" queries, I would consider creating one class per view using a tiny ORM (like Dapper, PetaPoco, or Massive) to query the objects with your own raw SQL.
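The "one class per view, filled from raw SQL" idea can be sketched without any of those libraries. The Python below is only an approximation of what a micro-ORM does, not the API of Dapper, PetaPoco, or Massive; the query helper, the RegionTotal class, and the sale table are hypothetical.

```python
import sqlite3
from dataclasses import dataclass, fields
from typing import Type, TypeVar

T = TypeVar("T")


def query(conn: sqlite3.Connection, view: Type[T], sql: str, params: tuple = ()) -> list[T]:
    """Run raw SQL and build one `view` instance per row, matching columns to fields by name."""
    cursor = conn.execute(sql, params)
    columns = [d[0] for d in cursor.description]
    wanted = {f.name for f in fields(view)}
    return [view(**{c: v for c, v in zip(columns, row) if c in wanted}) for row in cursor]


@dataclass
class RegionTotal:
    """Read-only view class shaped for one report, not for a table."""
    region: str
    total: int


# Demo data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sale VALUES (?, ?)", [("north", 10), ("south", 5), ("north", 7)])

totals = query(
    conn,
    RegionTotal,
    "SELECT region, SUM(amount) AS total FROM sale GROUP BY region ORDER BY region",
)
```

The aggregation happens in SQL, and the application only receives the few fields the view needs, which addresses the "less information than full objects" complaint in the question.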
For application developers, I suppose the traditional paradigm for writing an application with domain objects that can be persisted to an underlying data store (a SQL database, for argument's sake) is to write the domain objects and then write (or generate) the table structure. There is a tight coupling between what the domain object looks like and what the structure of the underlying data store looks like. So if you want to add a piece of information to your domain object, you add the field to your code and then add a column to the appropriate database table. All familiar?
This is all well and good for data stores that have a well defined structure (I'm mainly talking about SQL databases whereby the tables and columns are pre-defined and fixed), but now a number of alternatives to the ubiquitous SQL database exist and these often do not constrain the data in this way. For instance, MongoDB is a NoSQL database whereby you divide data into collections but aside from that there is no structuring of the data. You don't define new columns when you want to add a new field.
Now to the question: given the flexibility of a data store like MongoDB, how would one go about achieving a similar kind of flexibility in the domain objects that represent this data? So for instance, if I'm using Spring and creating my own domain objects, when I add a "middleName" field to my data, how can I avoid having to add a "middleName" field to my domain object? I'm looking for some kind of mechanism/approach/framework to dynamically inspect the data and have access to it in my domain object without having to make a code change every time. All ideas welcome.
I think you have a couple of choices:
You can use a dynamic programming language and not have domain objects (Clojure, for example).
If you're fixed on using Java, the Mongo Java driver returns data in a DBObject, which is essentially a Map. So the default behavior already provides what you want. It's only when you map the DBObject into domain objects, using a library like Morphia (or Spring Data), that you even have to worry about domain objects at all.
But if I were using Java, I would stick with the standard convention of domain objects mapped via Morphia, because I think adding a field is a very minor inconvenience when compared against the benefits.
I think the question is inherently paradoxical:
On one hand, you want to have domain objects, i.e. objects that represent the data (and behaviour) of your problem domain.
On the other hand, you say that you don't want your domain objects to be explicitly influenced by changes to the data.
But when you have objects that represent your problem domain, you want to do just that: represent your problem domain.
So if, for example, a middle name is added, then your representation of the real-life 'User' entity should change to accommodate this change to the real-life user, perhaps not only by adding this piece of data to your object, but also by adding some related behaviour (validation of the middle name, or some functionality related to it).
In essence, what I'm trying to say here is that when you have (classic OO) domain objects, you may need to change your behaviour / functionality along with your data, and since you don't have any automatic way of changing your behaviour, the question of automatically changing your data becomes irrelevant.
If you don't want behaviour associated with your data, then you essentially have DTOs, and @Kevin's answer is what you're looking for.
Honestly, it sounds more like you're looking for some kind of blackbox DTO where, like you describe, fields are added or removed "arbitrarily" depending on the data. This makes me inclined to suggest a simple Map to do the job. You can't really have a domain-driven design if your domain model is constantly changing.
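A Map-backed DTO of that kind could be as small as the Python sketch below. The Record class is hypothetical, and plain dicts stand in for documents read from the data store.

```python
from typing import Any


class Record:
    """Hypothetical black-box DTO: fields are whatever the stored document contains."""

    def __init__(self, data: dict[str, Any]) -> None:
        self._data = dict(data)

    def __getattr__(self, name: str) -> Any:
        # Called only when normal attribute lookup fails.
        if name == "_data":
            # Guard against recursion if _data itself is not set yet.
            raise AttributeError(name)
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None

    def field_names(self) -> list[str]:
        return sorted(self._data)


# Two documents from the same collection; only the newer one has a middle name.
old = Record({"firstName": "Ada", "lastName": "Lovelace"})
new = Record({"firstName": "Ada", "middleName": "King", "lastName": "Lovelace"})
```

Reading new.middleName works with no code change, but nothing here validates or acts on the new field, which is exactly the trade-off between a DTO and a real domain object described above.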
I have predefined tables in the database based on which I have to develop a web application.
Should I base my model classes on the structure of the data in the tables?
A problem is that the tables are very poorly defined and there is a lot of redundant data in them (which I cannot change!).
E.g., three columns are the same in two tables:
Table: Student_details
Student_id, Name, Age, Class, School
Table: Student_address
Student_id, Name, Age, Street1, Street2, City
I think you should make your models in a way that would be best suited for how they will be used. Don't worry about how the data is stored or where it is stored... otherwise why go through the trouble of layering your code. Why not just do the direct DB query right in your view? So if you are going to create an abstraction of your data... "model" ... make one that is designed around how it will be used... not how it will be or is persisted.
This seems like a risky project - presumably, there's another application somewhere which populates these tables. As the data model is not very sound from a relational point of view, I'm guessing there's a bunch of business/data logic glued into that app - for instance, putting the student age into the StudentAddress table.
I'd support jsobo in recommending you build your business logic independently of the underlying persistence mechanism, and that you try to keep your models as domain focused as possible, without too much emphasis on how the database happens to be structured.
You should, however, plan on spending a certain amount of time translating your domain models into their respective data representations and dealing with whatever quirks the data model imposes. I'd strongly recommend containing all this stuff in a separate translation layer - don't litter it throughout the rest of the application.
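A sketch of what such a translation layer might look like for the two tables in the question, written in Python with sqlite3. The Student and StudentTranslator names are hypothetical, and treating Student_details as the authoritative copy for reads is an assumption made only for this example.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Student:
    """Domain model shaped for the application, not for the tables."""
    student_id: int
    name: str
    age: int
    school: str
    city: str


class StudentTranslator:
    """Hypothetical translation layer; the only code aware of the duplicated columns."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def load(self, student_id: int) -> Optional[Student]:
        # Assumption: Name and Age are read from Student_details only.
        row = self._conn.execute(
            "SELECT d.Student_id, d.Name, d.Age, d.School, a.City "
            "FROM Student_details d "
            "JOIN Student_address a ON a.Student_id = d.Student_id "
            "WHERE d.Student_id = ?",
            (student_id,),
        ).fetchone()
        return None if row is None else Student(*row)

    def rename(self, student_id: int, name: str) -> None:
        # Name is stored twice, so both copies are updated in one transaction.
        with self._conn:
            for table in ("Student_details", "Student_address"):
                self._conn.execute(
                    f"UPDATE {table} SET Name = ? WHERE Student_id = ?",
                    (name, student_id),
                )


# Demo schema mirroring the tables from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student_details "
    "(Student_id INTEGER, Name TEXT, Age INTEGER, Class TEXT, School TEXT)"
)
conn.execute(
    "CREATE TABLE Student_address "
    "(Student_id INTEGER, Name TEXT, Age INTEGER, Street1 TEXT, Street2 TEXT, City TEXT)"
)
conn.execute("INSERT INTO Student_details VALUES (1, 'Ann', 12, '7B', 'Hill School')")
conn.execute("INSERT INTO Student_address VALUES (1, 'Ann', 12, '1 High St', '', 'Leeds')")
conn.commit()
translator = StudentTranslator(conn)
```

The rest of the application works with Student and never learns that Name and Age exist in two places.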
When using LINQ to SQL or Entity Framework, do we still need to separate the application into 3 layers (BLL, DAL, interface)?
Do what works for you. Building a wedding website with a handful of links and getting 5 content pages out of the database? More than 1 layer seems like tremendous overkill. On the flip side, for a very complex or large project, I think you'd want at least some degree of separation because it saves time and confusion, and preserves sanity.
It matters what you're working on and how much division it requires. Ultimately it's what you and your team prefer. There's no right answer, it's what fits the situation.
In projects I've been developing, I find value in creating a DL even when using Linq2Sql for data access.
My main reason is that many of the calls to the DL to retrieve one or more business objects from the DB actually require more than one call to the database, especially when implementing an eager-loading strategy. And when saving a business object whose data is stored in multiple tables, a transaction can be established across multiple calls to the database.
The business layer doesn't need to know that; it should be able to make a single call to the DL and leave it to the DL to do all the tedious querying and collation of data into business objects.
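As an illustration of that point (in Python with sqlite3 rather than Linq2Sql), here is a hypothetical data layer whose single save call writes two tables inside one transaction and whose single load call issues two queries. The Order class and the table names are invented for the sketch.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Order:
    id: int
    customer: str
    items: list[str] = field(default_factory=list)


class OrderDataLayer:
    """Hypothetical DL: one call from the business layer, several database calls inside."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def save(self, order: Order) -> None:
        # Two tables, one transaction: either both writes land or neither does.
        with self._conn:
            self._conn.execute(
                "INSERT INTO orders (id, customer) VALUES (?, ?)",
                (order.id, order.customer),
            )
            self._conn.executemany(
                "INSERT INTO order_item (order_id, name) VALUES (?, ?)",
                [(order.id, name) for name in order.items],
            )

    def load(self, order_id: int) -> Optional[Order]:
        # Eager loading: header and items are fetched and collated here.
        head = self._conn.execute(
            "SELECT id, customer FROM orders WHERE id = ?", (order_id,)
        ).fetchone()
        if head is None:
            return None
        rows = self._conn.execute(
            "SELECT name FROM order_item WHERE order_id = ? ORDER BY rowid", (order_id,)
        )
        return Order(head[0], head[1], [r[0] for r in rows])


# Demo schema in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT NOT NULL)")
conn.execute("CREATE TABLE order_item (order_id INTEGER NOT NULL, name TEXT NOT NULL)")
dl = OrderDataLayer(conn)
```

If a write to the second table fails, the `with` block rolls back the first write too, and the business layer never has to know that two tables were involved.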
I'm with @MikeJacobs.
I've actually written a LINQ2SQL library which abstracts ALL the DataContext stuff, and all the .Insert(), .Execute() and .SubmitChanges().
It's really nice to just abstract that away. In LINQ2SQL, you're still dependent on all your layers knowing about the LINQ2SQL entities, but my top layers very rarely send complex lambdas to the DAL; most of that is done in the DAL.