AsQueryable() in Business logic layer - bad practice? (I think so but....)

AsQueryable() in Business logic layer - bad practice? (I think so but....) - linq

Say for example I have (pseudo code):
public IEnumerable<User> GetUsers(string name) in my data access layer to Entity Framework, which at the moment does a .ToList() before returning, thus ensuring my business logic layer cannot interfere with my data access layer.
However I need a slightly different variation on this in my business logic layer, for example I need less data (e.g. just userid's or to further filter it).
To have an efficient DB layer I'd want another method returning a subset of the data (overloaded method or whatever).
However, I could "cheat" and omit the ToList(), and my business logic layer tacks on the end an AsQueryable(). Thus my business logic layer is able to manipulate the underlying sql that is created.
What are people's thoughts on AsQueryable() in business logic layers? It seems to me that this is a leaky abstraction over my data access layer, but it could be incredibly convenient, and perhaps because it's in the LINQ namespace (rather than EF namespace), that it's not all that bad to use?
EDIT
Something useful to watch out for (and argument against omitting ToList()), is that if code calling it was previously relying on the ToList() for databinding i.e. to avoid the error "Data binding directly to a store query (DbSet, DbQuery, DbSqlQuery) is not supported." You won't get a compile time error, just a runtime error. So you need to make sure ToList() is certainly called before the UI tier.

I would personally just add a second method that does a ToList() in my data access layer and call that. It's neater that way:
functionA()
{
return myDB.entityA.AsQueryable();
}
functionB()
{
return functionA().ToList();
}
You're probably going to need to call that same function from somewhere else sometime in the future. Keep it in one place.

Related

In MVC, should the Model or the Controller be processing and parsing data?

Until now, in my MVC application, I've been using the Model mainly just to access the database, and very little else. I've always looked on the Controller as the true brains of the operation. But I'm not sure if I've been correctly utilizing the MVC model.
For example, assume a database of financial transactions (order number, order items, amount, customer info, etc.). Now, assume there is a function to process a .csv file, and return it as an array, to be inserted into the database of transactions.
I've placed my .csv parse function in my Controller, then the controller passes the parsed information to a function in the Model to be inserted. However, strictly speaking, should the .csv parsing function be included in the Model instead?
EDIT: For clarity's sake, I specifically am using CodeIgniter, however the question does pertain to MVC structure in general.

The internet is full of discussion about what is true MVC. This answer is from the perspective of the CodeIgniter (CI) implementation of MVC. Read the official line here.
As it says on the linked page "CodeIgniter has a fairly loose approach to MVC...". Which, IMO, means there aren't any truly wrong ways to do things. That said, the MVC pattern is a pretty good way to achieve Separation of Concerns (SoC) (defined here). CI will allow you to follow the MVC pattern while, as the linked documentation page says, "...enabling you to work in a way that makes the most sense to you."
Models need not be restricted to database functions. (Though if that makes sense to you, then by all means, do it.) Many CI developers put all kinds of "business logic" in Models. Often this logic could just as easily reside in a custom library. I've often had cases where that "business logic" is so trivial it makes perfect sense to have it in a Controller. So, strictly speaking - there really isn't any strictly speaking.
In your case, and as one of the comments suggests, it might make sense to put the CSV functionality into a library (a.k.a. service). That makes it easy to use in multiple places - either Controller or Model.
Ultimately you want to keep any given block of code relevant to, and only to, the task at hand. Hopefully this can be done in a way that keeps the code DRY (Don't Repeat Yourself). It's up to you to determine how to achieve the desired end result.
You get to decide what the terms Model, View, and Controller mean.

As a general rule MVC is popular because it supports separation of concerns, which is a core tenet of SOLID programming. Speaking generically (different flavors support/ recommend different implementations), your model holds your data (and often metadata for how to validate or parse), your view renders your data, and your controller manages the flow of your data (this is also usually where security and validation occur).
In most systems, the Single Responsibility Principle would suggest that while business logic must occur at the controller level, it shouldn't actually occur in the controller class. Typically, business logic is done in a service, usually injected into the controller. The controller invokes the service with data from the model, gets a result that goes into the model (or a different model), and invokes the view to render it.
So in answer to your question, following "best practices" (and I'll put that in quotes because there's a lot of opinions out there and it's not a black and white proposition), your controller should not be processing and parsing data, and neither should your model; it should be invoking the service that processes and parses the data, then returning the results of aforementioned invocation.
Now... is it necessary to do that in a service? No. You may find it more appropriate, given the size and complexity of your application (i.e. small and not requiring regular maintenance and updates) to take some shortcuts and put the business logic into the controller or the model; it's not like it won't work. If you are following or intend to follow the intent of the Separation of Concerns and SOLID principles, however (and it's a good idea on larger, more complex projects), it's best to refactor that out.

Back to the old concept of decomposing the project logic as Models and Business Logic and the Data Access Layer.
Models was the very thin layer to represent the objects
Business Logic layer was for validations and calling the methods and for processing the data
Data Access Layer for connecting the database and to provide the OR relation
in the MVC, and taking asp.net/tutorials as reference:
Models : to store all the object structure
View: is like an engine to display the data was sent from the controller ( you can think about the view as the xsl file of the xml which is models in this case)
Controller: the place where you call the methods and to execute the processes.
usually you can extend the models to support the validations
finally, in my opinion and based on my experience, most of the sensitive processes that take some execution time, I code it on the sql server side for better performance, and easy to update the procedures in case if any rule was injected or some adjustments was required, all mentioned can be done without rebuilding your application.
I hope my answer gives you some hints and to help you

If your CSV processing is used in more than one place, you can use a CI library to store the processing function. Or you can create a CSV model to store the processing function. That is up to you. I would initially code this in the controller, then if needed again elsewhere, that is when I would factor it out into a library.
Traditionally, models interact with the database, controllers deal with the incoming request, how to process it, what view to respond with. That leaves a layer of business logic (for instance your CSV processing) which I would put in a library, but many would put in its own model.
There is no hard rule about this. MVC, however it was initially proposed, is a loose term interpreted differently in different environments.
Personally, with CI, I use thin controllers, fat models that also contain business logic, and processing logic like CSV parsing I would put in a library, for ease of reuse between projects.

How to name the layer between Controller and Model codeigniter MVC

I wanna restrict model to calling to db only
while controller will call model, libraries or helpers.
I do not want to put logic in controller nor in the model to prepare data for views.
Now the logic for preparing all the arrays for views are done in controller. I am creating a library to separate this part as sometimes i feel it is overloading the controller
Hence, i want to create a library class and make controller build the view data before throwing it to the view. It is not exactly templating.
The thing is i do not know how to name it.. Any good suggestion ?
I am thinking view_builder, ui_builder, ui_components?
Cheers

Here's how I'd layer the app:
View
Controller
Service
Persistence
View is either desktop or browser or mobile-based.
Controller is tightly bound to view. It's responsible for validating and binding input to model objects, calling services to fulfill use cases, and routing the response to the next view.
Services fulfill use cases. They know about units of work, own transactions, and manage connections to resources like databases. They work with model objects, other services, and persistence objects. They're interface-based objects, but can be remoted or exposed as web services - RPC-XML, SOAP, REST or other.
Persistence is another interfaced-based object. The implementation can be relational or NoSQL; the important thing is that the interface expresses CRUD operations for model objects. If you use generics, it's possible to write one interface that works for all.
I wouldn't have model objects persist themselves. I'm aware of the "anemic domain model" pejorative, but I think more exciting behavior should center around the business purpose, not CRUD operations.

Good setup. I also sometimes use CI libraries to work out the kinks in a returned data array before passing it to a view. I also sometimes just use the model.
And good for you for thinking about names - I think all the ones you mention are fine; you could also think about naming your library something like data_structure or array_to_object - or something more specific to your own problem like friend_map or tag_cloud.
My advice: pick a name, and then don't be afraid to change it if something more descriptive comes along or the function of your library evolves into something else. Find+replace is your friend.

Is there a better way to perform validation in Entity Framework 4 than overriding SavingChanges?

It feels a little wrong to me to be doing Validation in your context which is your data layer. For the data layer to have a dependency on validation seems a little backwards. It makes more sense to me for validation to be done in business logic and once the validation is complete then persist the changes. The data layer should be able to rely on the fact that the data is valid.
The second reason I don't really like overriding SavingChanges is that it seems very messy and gets full of code quickly. When you have a complex object graph that method is going to get very messy VERY quickly with a lot of
if (entity is MyType)
{
//perform MyType validation
}
Third reason I don't really like this is all the extra validator objects that I would have to pass in the constructor of the object context. I am using a DI framework but it still feels messy.
I suppose I only have this problem because I am letting people use the repositories. I could add a layer of indirection and then they would have to use some sort of DTO. The downside here is it seems like a lot of extra work. Also the constant mapping that would have to take place.
Is there anyway that you could give access to repositories and to do validation without sticking everything into one SavingChanges method?

If your code is ugly, write nicer code. :)
var entities = context.ObjectStateManager.GetObjectStateEntries(EntityState.Added | EntityState.Modified))
.Select(ose => ose.Entity);
var errors = entities.OfType<IMyValidationInterface>()
.SelectMany(e => e.Validate());
...and done!

EF4, Lambda, Repository pattern and DTOs

I have a semi complicated question regarding Entity Framework4, Lambda expressions, and Data Transfer Objects (DTO).
So I have a small EF4 project, and following established OO principles, I have a DTO to provide a layer of abstraction between the data consumers (GUI) and the data model.
VideoDTO = DTO with getters/setters, used by the GUI
VideoEntity = Entity generated by EF4
My question revolves around the use of the DTO by the GUI (and not having the GUI use the Entity at all), combined with a need to pass a lambda to the data layer. My data layer is a basic repository pattern with Add. Change, Delete, Get, GetList, etc.
Trying to implement a Find method with a signature like so:
public IEnumerable<VideoDTO> Find(Expression<Func<VideoEntity, bool>> exp)
...
_dataModel.Videos.Where(exp).ToList<Video>()
---
My problem/concern is the "exp" needing to be of type VideoEntity instead of VideoDTO. I want to preserve the separation of concerns so that the GUI does not know about the Entity objects. But if I try to pass in
Func<VideoDTO, bool>
I cannot then do a LINQ Where on that expression using the actual data model.
Is there a way to convert a Func<VideoDTO,bool> to a Func<VideoEntity, bool>
Ideally my method signature would accept Func<VideoDTO, bool> and that way the GUI would have no reference to the underlying data entity.
Is this clear enough? Thanks for your help
Thanks for the repliesto both of you.
I'll try the idea of defining the search criteria in an object and using that in the LINQ expression. Just starting out with both EF4 and L2S, using this as a learning project.
Thanks again!

In architectures like CQRS there isn't need for such a conversion at all cause read & write sides of app are separated.
But in Your case, You can't runaway from translation.
First of all - You should be more specific when defining repositories. Repository signature is thing You want to keep explicit instead of generic.
Common example to show this idea - can You tell what indexes You need in Your database when You look at Your repository signature (maybe looking at repository implementation, but certainly w/o looking at client code)? You can't. Cause it's too generic and client side can search by anything.
In Your example it's a bit better cause expression genericness is tied with dto instead of entity.
This is what I do (using NHibernate.Linq, but the idea remains)
public class Application{
public Project Project {get;set;}
}
public class ApplicationRepository{
public IEnumerable<Application> Search(SearchCriteria inp){
var c=Session.Linq<Application>();
var q=c.AsQueryable();
if(!string.IsNullOrEmpty(inp.Acronym))
q=q.Where(a=>a.Project.Acronym.Contains(inp.Acronym));
/*~20 lines of similar code snipped*/
return q.AsQueryable();
}
}
//used by client
public class SearchCriteria{
public string Acronym{get;set;}
/*some more fields that defines how we can search Applications*/
}
If You do want to keep Your expressions, one way would be to define dictionary manually like this:
var d=new Dictionary<Expression<Func<VideoDTO,object>>,
Expression<Func<VideoEntity,object>>{
{x=>x.DtoPropNumberOne,x=>x.EntityPropNumberOne} /*, {2}, {3}, etc.*/
};
And use it later:
//can You spot it?
//client does not know explicitly what expressions dictionary contains
_dataModel.Videos.Where(d[exp]).ToList<Video>();
//and I'm not 100% sure checking expression equality would actually work
If You don't want to write mapping dictionary manually, You will need some advanced techniques. One idea would be to translate dto expression to string and then back to entity expression. Here are some ideas (sorting related though) that might help. Expressions are quite complicated beasts.
Anyway - as I said, You should avoid this. Otherwise - You will produce really fragile code.

Perhaps your design goal is to prevent propagation of the data model entities to the client tier rather than to prevent a dependency between the presentation layer and data model. If viewed that way then there would be nothing wrong with the query being formed the way you state.
To go further you could expose the searchable fields from VideoEntity via an interface (IVideoEntityQueryFields) and use that as the type in the expression.
If you don't want to add an interface to your entities then the more complicated option is to use a VideoEntityQuery object and something that translates an Expression<Func<VideoEntityQuery,bool>> to an Expression<Func<VideoEntity,bool>>.

Do you ToList()?

Do you have a default type that you prefer to use in your dealings with the results of LINQ queries?
By default LINQ will return an IEnumerable<> or maybe an IOrderedEnumerable<>. We have found that a List<> is generally more useful to us, so have adopted a habit of ToList()ing our queries most of the time, and certainly using List<> in our function arguments and return values.
The only exception to this has been in LINQ to SQL where calling .ToList() would enumerate the IEnumerable prematurely.
We are also using WCF extensively, the default collection type of which is System.Array. We always change this to System.Collections.Generic.List in the Service Reference Settings dialog in VS2008 for consistency with the rest of our codebase.
What do you do?

ToList always evaluates the sequence immediately - not just in LINQ to SQL. If you want that, that's fine - but it's not always appropriate.
Personally I would try to avoid declaring that you return List<T> directly - usually IList<T> is more appropriate, and allows you to change to a different implementation later on. Of course, there are some operations which are only specified on List<T> itself... this sort of decision is always tricky.
EDIT: (I would have put this in a comment, but it would be too bulky.) Deferred execution allows you to deal with data sources which are too big to fit in memory. For instance, if you're processing log files - transforming them from one format to another, uploading them into a database, working out some stats, or something like that - you may very well be able to handle arbitrary amounts of data by streaming it, but you really don't want to suck everything into memory. This may not be a concern for your particular application, but it's something to bear in mind.

We have the same scenario - WCF communications to a server, the server uses LINQtoSQL.
We use .ToArray() when requesting objects from the server, because it's "illegal" for the client to change the list. (Meaning, there is no purpose to support ".Add", ".Remove", etc).
While still on the server, however, I would recommend that you leave it as it's default (which is not IEnumerable, but rather IQueryable). This way, if you want to filter even more based on some criteria, the filtering is STILL on the SQL side until evaluated.
This is a very important point as it means incredible performance gains or losses depending on what you do.
EXAMPLE:
// This is just an example... imagine this is on the server only. It's the
// basic method that gets the list of clients.
private IEnumerable<Client> GetClients()
{
var result = MyDataContext.Clients;
return result.AsEnumerable();
}
// This method here is actually called by the user...
public Client[] GetClientsForLoggedInUser()
{
var clients = GetClients().Where(client=> client.Owner == currentUser);
return clients.ToArray();
}
Do you see what's happening there? The "GetClients" method is going to force a download of ALL 'clients' from the database... THEN the Where clause will happen in the GetClientsForLoogedInUser method to filter it down.
Now, notice the slight change:
private IQueryable<Client> GetClients()
{
var result = MyDataContext.Clients;
return result.AsQueryable();
}
Now, the actual evaluation won't happen until ".ToArray" is called... and SQL will do the filtering. MUCH better!

In the Linq-to-Objects case, returning List<T> from a function isn't as nice as returning IList<T>, as THE VENERABLE SKEET points out. But often you can still do better than that. If the thing you are returning ought to be immutable, IList is a bad choice because it invites the caller to add or remove things.
For example, sometimes you have a method or property that returns the result of a Linq query or uses yield return to lazily generate a list, and then you realise that it would be better to do that the first time you're called, cache the result in a List<T> and return the cached version thereafter. That's when returning IList may be a bad idea, because the caller may modify the list for their own purposes, which will then corrupt your cache, making their changes visible to all other callers.
Better to return IEnumerable<T>, so all they have is forward iteration. And if the caller wants rapid random access, i.e. they wish they could use [] to access by index, they can use ElementAt, which Linq defines so that it quietly sniffs for IList and uses that if available, and if not it does the dumb linear lookup.
One thing I've used ToList for is when I've got a complex system of Linq expressions mixed with custom operators that use yield return to filter or transform lists. Stepping through in the debugger can get mighty confusing as it jumps around doing lazy evaluation, so I sometimes temporarily add a ToList() to a few places so that I can more easily follow the execution path. (Although if the things you are executing have side-effects, this can change the meaning of the program.)

It depends if you need to modify the collection. I like to use an Array when I know that no one is going to add/delete items. I use a list when I need to sort/add/delete items. But, usually I just leave it as IEnumerable as long as I can.

If you don't need the added features of List<>, why not just stick with IQueryable<> ?!?!?! Lowest common denominator is the best solution (especially when you see Timothy's answer).

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio