Is there an implementation of IQueryable over DbDataReader? - linq

I have a lot of existing code which uses raw ADO.NET (DbConnection, DbDataReader, etc). I would like to transition to using LINQ to SQL for new code, but for now put both the existing and new code behind a unified set of Repository classes.
One issue I have is this: I would like the Repository classes to expose result sets as IQueryable<> which I get for free with LINQ to SQL. How do I wrap my existing DbDataReader result sets in an IQueryable? Do I have to implement IQueryable over DbDataReader from scratch?
Note I am aware of LINQ to DataSet, but I don't use DataSets because of memory scale issues, as the result sets I deal with can be quite large (order of 1000s). This implies that the IQueryable over DbDataReader implementation will need to be efficient as well (i.e. don't cache results in memory).

I can't see any benefit in implementing IQueryable<T> - that suggests more functionality than is actually available - however, you could implement it as an IEnumerable<T> easily enough, with the caveat that it is once-only. An iterator block would be a reasonable choice:
public static IEnumerable<IDataRecord> AsEnumerable(
    this IDataReader reader)
{
    while (reader.Read())
    {
        yield return reader; // a bit dangerous
    }
}
The "a bit dangerous" is because the caller could cast it back and abuse it...
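One way to reduce that risk, as a rough sketch: copy each row into a plain object inside the iterator, so the caller never receives the reader itself. The `SelectRows` name, the `Person` type and the DataTable-backed demo data are assumptions made for illustration, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical row type used only for this illustration.
public sealed class Person
{
    public int Id { get; }
    public string Name { get; }
    public Person(int id, string name) { Id = id; Name = name; }
}

public static class DataReaderExtensions
{
    // Streams rows one at a time. Each row is projected into a T before it
    // is yielded, so the caller never gets a reference to the reader.
    public static IEnumerable<T> SelectRows<T>(
        this IDataReader reader, Func<IDataRecord, T> projection)
    {
        while (reader.Read())
        {
            yield return projection(reader);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // In-memory stand-in for a real query result (made-up data).
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add(1, "Ann");
        table.Rows.Add(2, "Bob");

        using (IDataReader reader = table.CreateDataReader())
        {
            foreach (Person p in reader.SelectRows(
                r => new Person(r.GetInt32(0), r.GetString(1))))
            {
                Console.WriteLine(p.Id + ": " + p.Name);
            }
        }
    }
}
```

The sequence is still once-only and forward-only, and nothing is cached, so memory use stays flat however many rows the reader returns.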

Related

What is the benefit of using IQueryable and LINQ queries?

I have a project that implements its own configuration classes:
IconSizesConfigSection: ConfigurationSection
IconSizesCollection: ConfigurationElementCollection
IconSize: ConfigurationElement
The Config class has this property:
public IQueryable<IconSize> IconSizes
{
    get
    {
        IconSizesConfigSection configInfo = (IconSizesConfigSection)ConfigurationManager.GetSection("iconConfig");
        return configInfo.IconSizes.OfType<IconSize>().AsQueryable<IconSize>();
    }
}
The IconSizes property returns an IconSizesCollection, which derives from ConfigurationElementCollection. In turn, ConfigurationElementCollection implements ICollection and IEnumerable.
In another class I have this code:
var previewIconSize = Config.IconSizes.FirstOrDefault(c => c.Name == "AvatarSize");
Why is deferred execution used in this case?
Why does it first call AsQueryable<IconSize>() on the collection and then use LINQ and deferred execution?
Are there any benefits compared with using a simple List?
In this case, there is no practical benefit. Using IQueryable is helpful when query rewriting or translation will optimize performance. In the provided example you will actually incur decreased performance.
One example of using IQueryable in a helpful way is the significant performance increase gained when lazily translating and evaluating queries against a database or web service. This will perform significantly better than the alternative of pulling massive result sets and applying query logic in active memory with a "simple List".
You can tell that using IQueryable is detrimental in your case because the collection is already loaded into memory when you begin the query.
Both IEnumerable and IQueryable use deferred execution. The difference is that IQueryable is used to cross boundaries, such as database queries, Entity Framework queries or OData queries.
When an IQueryable is iterated over, the query is translated to the remote provider's idiom and executed there. When the response is received from the remote provider, it is translated to a local object representation.
Deferred Execution is good because your user may never use the result set and hence there would have been no point querying the data source.
There may be some LINQ methods your user can't use unless they cast the result to IQueryable which means you might restrict what they can do, or force them to cast/copy the list into something more useful.
If you use a List, then you're hard-coding your solution to a List. Do you care what the implementation of the collection is? Does your user? Probably not, as long as it supports the necessary interfaces.
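The deferred-execution point can be illustrated with a small in-memory sketch. The names below (`DeferredDemo`, the size strings) are made up for the illustration; the point is only that a LINQ query over a list is evaluated when it is enumerated, not when it is defined, while ToList() takes a snapshot.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Defines a query, mutates the source, and only then enumerates.
    public static string LateAddition()
    {
        var sizes = new List<string> { "AvatarSize", "ThumbSize" };

        // Nothing is evaluated here; the query only records what to do.
        IEnumerable<string> query = sizes.Where(s => s.StartsWith("A"));

        // Added after the query was defined, but before it is enumerated.
        sizes.Add("AlbumSize");

        // Enumeration happens inside Join, so the late addition is included.
        return string.Join(", ", query);
    }

    // ToList() evaluates immediately; later changes do not affect the copy.
    public static int SnapshotCount()
    {
        var sizes = new List<string> { "AvatarSize", "ThumbSize" };
        List<string> snapshot = sizes.Where(s => s.StartsWith("A")).ToList();
        sizes.Add("AlbumSize");
        return snapshot.Count;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(DeferredDemo.LateAddition());  // AvatarSize, AlbumSize
        Console.WriteLine(DeferredDemo.SnapshotCount()); // 1
    }
}
```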

Create my own LINQ collection

I'm looking for good tutorials on how to create LINQ accessors/APIs to my business classes. So that someone could eventually enter something like this in a program--
var potentialCustomers = from people in county
                         where people.NumberOfCats > 2
                         select people;
I've used LINQ often enough with the .Net collections, but have never done it before on my own classes. Is it just a matter of implementing IEnumerable, or are there additional steps needed?
LINQ is an interesting beast.
Immediately IEnumerable<T> comes to mind when discussing LINQ. It seems that IEnumerable<T> is LINQ, but it is not. IEnumerable<T> is one implementation of the LINQ methods that allow LINQ queries to be written against objects that implement IEnumerable<T>.
Another implementation is IObservable<T>, which powers Microsoft's Reactive Extensions. This is a set of extensions that allow LINQ queries to be written against events (or streams of data). Nothing to do with IEnumerable<T>.
LINQ also can be written directly in your objects - it doesn't have to be extension methods at all.
For example, define classes A and B like so:
public class A
{
    public B Select(Func<A, B> selector)
    {
        return selector(this);
    }
}
public class B
{
    public B(A a) { }
}
Now I can write this code:
B query =
    from x in a
    select new B(x);
It's LINQ, Jim, but not as we know it.
All of the LINQ operators can be defined this way. So long as the compiler gets to see methods with the right signature you're golden.
Having said this, LINQ queries feel natural when working with a series of values, which is why IEnumerable<T> and IObservable<T> are good examples of LINQ in action. But it certainly is possible to define LINQ against any type you like just by implementing the right methods.
You just need to implement the IEnumerable<T> interface in your class and then you can use LINQ, because LINQ to Objects is a set of extension methods for IEnumerable<T> objects.
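As a minimal sketch of the IEnumerable<T> route, the query from the question compiles against a business class once that class implements IEnumerable<T>. The `County` and `Person` types below are hypothetical stand-ins for the asker's classes.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string Name { get; set; }
    public int NumberOfCats { get; set; }
}

// Hypothetical business collection. Implementing IEnumerable<Person> is
// enough for the standard LINQ to Objects operators to apply.
public class County : IEnumerable<Person>
{
    private readonly List<Person> _residents = new List<Person>();

    public void Add(Person person) { _residents.Add(person); }

    public IEnumerator<Person> GetEnumerator() { return _residents.GetEnumerator(); }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class Program
{
    public static void Main()
    {
        var county = new County
        {
            new Person { Name = "Ann", NumberOfCats = 3 },
            new Person { Name = "Bob", NumberOfCats = 1 }
        };

        var potentialCustomers = from people in county
                                 where people.NumberOfCats > 2
                                 select people;

        foreach (var p in potentialCustomers)
        {
            Console.WriteLine(p.Name);
        }
    }
}
```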

Performance of IQueryable versus Dictionary

I'm caching a whole bunch of static metadata in my app at startup. It's from a db and there are tons of Foreign Key relationships. I'm looking into the best way of modelling them.
I've just started off with LINQ.
It's easy for me to declare a class
public class AllData {
    public static IQueryable<libm_ColPurpose> IQ_libm_ColPurpose = libm_ColPurpose.All();
    public static IQueryable<libm_ColType> IQ_libm_ColType = libm_ColType.All();
    ...
(I'm using SubSonic 3 to generate my classes, but that's beside the point).
Then I can use the IQueryable<T> members to get access to anything I want, for example:
libm_ColType ct = AllData.IQ_libm_ColType.SingleOrDefault(x => x.ColTypeStr == this.DefaultJetColTypeStr);
Prior to using IQueryable I was using Dictionaries to store FK relationships, so to mimic the above code I'd write the following, starting from a preexisting List<libm_ColType> list
Dictionary<string, libm_ColType> colTypeByColTypeStr = new Dictionary<string, libm_ColType>();
foreach (libm_ColType x in list) { colTypeByColTypeStr.Add(x.ColTypeStr, x); }
and then I could use
libm_ColType ct = colTypeByColTypeStr[this.DefaultJetColTypeStr];
OK, so finally we get to the question !
The Dictionary lookup by ID is extremely efficient, however the IQueryable solution is far more flexible and elegant.
I'm wondering how much of a performance hit I'm going to get using IQueryable. I suspect I am doing a linear scan of the list each time I call it, and that's really gonna add up over repeat calls if there are a lot of records involved.
It would be great if I could identify unique-valued columns and have a hashtable generated and cached after the first lookup, but I suspect this is not gonna be part of the offering.
This is a bit of a dealbreaker for me regarding using LINQ.
Note (I'll repeat it again) that I'm NOT pulling data from a database, it's already in memory and I'm querying it there, so I'm only interested in looking up the in-memory IQueryable<T>.
IQueryable represents a collection in a data-store, so you probably don't have the collections in memory. If you explicitly want in-memory collections, then I would go back to your dictionaries. Remember, this doesn't prevent you from using LINQ queries over the data.
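A small sketch of combining the two approaches: keep the data in an ordinary in-memory list, build the dictionary index once with the LINQ `ToDictionary` operator, and still run ad hoc LINQ queries over the list. `ColType` and `ColTypeCache` are simplified, hypothetical stand-ins for the generated SubSonic classes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the generated libm_ColType class.
public class ColType
{
    public string ColTypeStr { get; set; }
    public string Description { get; set; }
}

public static class ColTypeCache
{
    // Builds a hash index in one pass. ToDictionary throws an
    // ArgumentException if two items share the same key.
    public static Dictionary<string, ColType> IndexByColTypeStr(IEnumerable<ColType> items)
    {
        return items.ToDictionary(x => x.ColTypeStr);
    }
}

public static class Program
{
    public static void Main()
    {
        var list = new List<ColType>
        {
            new ColType { ColTypeStr = "TEXT", Description = "Text" },
            new ColType { ColTypeStr = "LONG", Description = "Long Integer" }
        };

        // Hash lookup for the hot path.
        Dictionary<string, ColType> byStr = ColTypeCache.IndexByColTypeStr(list);
        ColType ct;
        if (byStr.TryGetValue("LONG", out ct))
        {
            Console.WriteLine(ct.Description);
        }

        // Ad hoc LINQ query over the same list; this one is a linear scan.
        ColType scanned = list.SingleOrDefault(x => x.Description.StartsWith("Te"));
        Console.WriteLine(scanned.ColTypeStr);
    }
}
```

The dictionary covers repeated lookups by a unique key, while the LINQ query stays available for one-off filters where a linear scan is acceptable.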

EF4, Lambda, Repository pattern and DTOs

I have a semi complicated question regarding Entity Framework4, Lambda expressions, and Data Transfer Objects (DTO).
So I have a small EF4 project, and following established OO principles, I have a DTO to provide a layer of abstraction between the data consumers (GUI) and the data model.
VideoDTO = DTO with getters/setters, used by the GUI
VideoEntity = Entity generated by EF4
My question revolves around the use of the DTO by the GUI (and not having the GUI use the Entity at all), combined with a need to pass a lambda to the data layer. My data layer is a basic repository pattern with Add, Change, Delete, Get, GetList, etc.
Trying to implement a Find method with a signature like so:
public IEnumerable<VideoDTO> Find(Expression<Func<VideoEntity, bool>> exp)
...
_dataModel.Videos.Where(exp).ToList<Video>()
---
My problem/concern is the "exp" needing to be of type VideoEntity instead of VideoDTO. I want to preserve the separation of concerns so that the GUI does not know about the Entity objects. But if I try to pass in
Func<VideoDTO, bool>
I cannot then do a LINQ Where on that expression using the actual data model.
Is there a way to convert a Func<VideoDTO, bool> to a Func<VideoEntity, bool>?
Ideally my method signature would accept Func<VideoDTO, bool> and that way the GUI would have no reference to the underlying data entity.
Is this clear enough? Thanks for your help
Thanks for the replies to both of you.
I'll try the idea of defining the search criteria in an object and using that in the LINQ expression. Just starting out with both EF4 and L2S, using this as a learning project.
Thanks again!
In architectures like CQRS there is no need for such a conversion at all, because the read and write sides of the app are separated.
But in Your case, You can't run away from translation.
First of all - You should be more specific when defining repositories. A repository signature is something You want to keep explicit instead of generic.
A common example to show this idea - can You tell what indexes You need in Your database when You look at Your repository signature (maybe looking at the repository implementation, but certainly w/o looking at client code)? You can't, because it's too generic and the client side can search by anything.
In Your example it's a bit better, because the genericness of the expression is tied to the DTO instead of the entity.
This is what I do (using NHibernate.Linq, but the idea remains)
public class Application{
    public Project Project {get;set;}
}
public class ApplicationRepository{
    public IEnumerable<Application> Search(SearchCriteria inp){
        var c=Session.Linq<Application>();
        var q=c.AsQueryable();
        if(!string.IsNullOrEmpty(inp.Acronym))
            q=q.Where(a=>a.Project.Acronym.Contains(inp.Acronym));
        /*~20 lines of similar code snipped*/
        return q.AsQueryable();
    }
}
//used by client
public class SearchCriteria{
    public string Acronym{get;set;}
    /*some more fields that defines how we can search Applications*/
}
If You do want to keep Your expressions, one way would be to define dictionary manually like this:
var d=new Dictionary<Expression<Func<VideoDTO,object>>,
    Expression<Func<VideoEntity,object>>>{
    {x=>x.DtoPropNumberOne,x=>x.EntityPropNumberOne} /*, {2}, {3}, etc.*/
};
And use it later:
//can You spot it?
//client does not know explicitly what expressions dictionary contains
_dataModel.Videos.Where(d[exp]).ToList<Video>();
//and I'm not 100% sure checking expression equality would actually work
If You don't want to write mapping dictionary manually, You will need some advanced techniques. One idea would be to translate dto expression to string and then back to entity expression. Here are some ideas (sorting related though) that might help. Expressions are quite complicated beasts.
Anyway - as I said, You should avoid this. Otherwise - You will produce really fragile code.
Perhaps your design goal is to prevent propagation of the data model entities to the client tier rather than to prevent a dependency between the presentation layer and data model. If viewed that way then there would be nothing wrong with the query being formed the way you state.
To go further you could expose the searchable fields from VideoEntity via an interface (IVideoEntityQueryFields) and use that as the type in the expression.
If you don't want to add an interface to your entities then the more complicated option is to use a VideoEntityQuery object and something that translates an Expression<Func<VideoEntityQuery,bool>> to an Expression<Func<VideoEntity,bool>>.
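For the translation option, one possible approach is an ExpressionVisitor that rewrites member accesses on the DTO parameter into accesses of the same-named member on the entity. The sketch below assumes the DTO and entity share property names and types. The `Title` and `Length` properties and the `DtoToEntityVisitor` and `PredicateTranslator` names are hypothetical. Predicates that use the DTO parameter in any other way (for example, passing it to a method) are not handled.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical shapes; the real classes would come from the project.
public class VideoDTO { public string Title { get; set; } public int Length { get; set; } }
public class VideoEntity { public string Title { get; set; } public int Length { get; set; } }

// Replaces dto.Member with entity.Member, matching members by name.
public class DtoToEntityVisitor : ExpressionVisitor
{
    private readonly ParameterExpression _dtoParam;
    private readonly ParameterExpression _entityParam;

    public DtoToEntityVisitor(ParameterExpression dtoParam, ParameterExpression entityParam)
    {
        _dtoParam = dtoParam;
        _entityParam = entityParam;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        if (node.Expression == _dtoParam)
        {
            // Throws ArgumentException if the entity has no member with this name.
            return Expression.PropertyOrField(_entityParam, node.Member.Name);
        }
        return base.VisitMember(node);
    }
}

public static class PredicateTranslator
{
    public static Expression<Func<VideoEntity, bool>> ToEntityPredicate(
        Expression<Func<VideoDTO, bool>> dtoPredicate)
    {
        var entityParam = Expression.Parameter(typeof(VideoEntity), "e");
        var visitor = new DtoToEntityVisitor(dtoPredicate.Parameters[0], entityParam);
        Expression body = visitor.Visit(dtoPredicate.Body);
        return Expression.Lambda<Func<VideoEntity, bool>>(body, entityParam);
    }
}

public static class Program
{
    public static void Main()
    {
        Expression<Func<VideoDTO, bool>> dtoFilter =
            d => d.Length > 10 && d.Title.Contains("cat");

        Expression<Func<VideoEntity, bool>> entityFilter =
            PredicateTranslator.ToEntityPredicate(dtoFilter);

        // In-memory stand-in for _dataModel.Videos.
        var videos = new[]
        {
            new VideoEntity { Title = "cat video", Length = 30 },
            new VideoEntity { Title = "dog video", Length = 30 }
        }.AsQueryable();

        Console.WriteLine(videos.Where(entityFilter).Count()); // 1
    }
}
```

Note that the repository would need to accept an Expression<Func<VideoDTO, bool>> rather than a plain Func<VideoDTO, bool>, since a compiled delegate cannot be inspected or translated this way.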

Do you ToList()?

Do you have a default type that you prefer to use in your dealings with the results of LINQ queries?
By default LINQ will return an IEnumerable<> or maybe an IOrderedEnumerable<>. We have found that a List<> is generally more useful to us, so have adopted a habit of ToList()ing our queries most of the time, and certainly using List<> in our function arguments and return values.
The only exception to this has been in LINQ to SQL where calling .ToList() would enumerate the IEnumerable prematurely.
We are also using WCF extensively, the default collection type of which is System.Array. We always change this to System.Collections.Generic.List in the Service Reference Settings dialog in VS2008 for consistency with the rest of our codebase.
What do you do?
ToList always evaluates the sequence immediately - not just in LINQ to SQL. If you want that, that's fine - but it's not always appropriate.
Personally I would try to avoid declaring that you return List<T> directly - usually IList<T> is more appropriate, and allows you to change to a different implementation later on. Of course, there are some operations which are only specified on List<T> itself... this sort of decision is always tricky.
EDIT: (I would have put this in a comment, but it would be too bulky.) Deferred execution allows you to deal with data sources which are too big to fit in memory. For instance, if you're processing log files - transforming them from one format to another, uploading them into a database, working out some stats, or something like that - you may very well be able to handle arbitrary amounts of data by streaming it, but you really don't want to suck everything into memory. This may not be a concern for your particular application, but it's something to bear in mind.
We have the same scenario - WCF communications to a server, the server uses LINQtoSQL.
We use .ToArray() when requesting objects from the server, because it's "illegal" for the client to change the list. (Meaning, there is no purpose to support ".Add", ".Remove", etc).
While still on the server, however, I would recommend that you leave it as its default (which is not IEnumerable, but rather IQueryable). This way, if you want to filter even more based on some criteria, the filtering is STILL on the SQL side until evaluated.
This is a very important point as it means incredible performance gains or losses depending on what you do.
EXAMPLE:
// This is just an example... imagine this is on the server only. It's the
// basic method that gets the list of clients.
private IEnumerable<Client> GetClients()
{
    var result = MyDataContext.Clients;
    return result.AsEnumerable();
}
// This method here is actually called by the user...
public Client[] GetClientsForLoggedInUser()
{
    var clients = GetClients().Where(client=> client.Owner == currentUser);
    return clients.ToArray();
}
Do you see what's happening there? The "GetClients" method is going to force a download of ALL 'clients' from the database... THEN the Where clause will happen in the GetClientsForLoggedInUser method to filter it down.
Now, notice the slight change:
private IQueryable<Client> GetClients()
{
    var result = MyDataContext.Clients;
    return result.AsQueryable();
}
Now, the actual evaluation won't happen until ".ToArray" is called... and SQL will do the filtering. MUCH better!
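The difference comes down to which Where overload the compiler binds to, and that depends only on the static type. The sketch below shows this with an in-memory source instead of a real DataContext (the `BindingDemo` name is made up). With a LINQ to SQL table, the first form would become part of the translated query, while the second would filter locally after fetching the rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BindingDemo
{
    // Keeps the IQueryable static type, so Where binds to Queryable.Where
    // and the filter becomes part of the query's expression tree.
    public static IQueryable<int> FilterAsQueryable(IQueryable<int> source)
    {
        return source.Where(x => x > 1);
    }

    // AsEnumerable changes the static type, so Where binds to
    // Enumerable.Where and the filter runs in local memory.
    public static IEnumerable<int> FilterAsEnumerable(IQueryable<int> source)
    {
        return source.AsEnumerable().Where(x => x > 1);
    }
}

public static class Program
{
    public static void Main()
    {
        IQueryable<int> source = new[] { 1, 2, 3 }.AsQueryable();

        // The expression tree includes the Where call.
        Console.WriteLine(BindingDemo.FilterAsQueryable(source).Expression);

        // The locally filtered sequence is no longer a queryable.
        Console.WriteLine(BindingDemo.FilterAsEnumerable(source) is IQueryable<int>);
    }
}
```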
In the Linq-to-Objects case, returning List<T> from a function isn't as nice as returning IList<T>, as THE VENERABLE SKEET points out. But often you can still do better than that. If the thing you are returning ought to be immutable, IList is a bad choice because it invites the caller to add or remove things.
For example, sometimes you have a method or property that returns the result of a Linq query or uses yield return to lazily generate a list, and then you realise that it would be better to do that the first time you're called, cache the result in a List<T> and return the cached version thereafter. That's when returning IList may be a bad idea, because the caller may modify the list for their own purposes, which will then corrupt your cache, making their changes visible to all other callers.
Better to return IEnumerable<T>, so all they have is forward iteration. And if the caller wants rapid random access, i.e. they wish they could use [] to access by index, they can use ElementAt, which Linq defines so that it quietly sniffs for IList and uses that if available, and if not it does the dumb linear lookup.
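A rough sketch of that caching pattern, with a made-up `SquaresSource` class: the result is cached in a List<T> but handed out as IEnumerable<T> through a read-only wrapper, so a caller who casts it back still cannot corrupt the cache. The wrapper implements IList<T>, so ElementAt can index into it directly. This version is not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SquaresSource
{
    private List<int> _cache;

    // Lazily generated sequence (stand-in for an expensive query).
    private static IEnumerable<int> Compute()
    {
        for (int i = 1; i <= 5; i++)
        {
            yield return i * i;
        }
    }

    public IEnumerable<int> Values
    {
        get
        {
            if (_cache == null)
            {
                _cache = Compute().ToList();
            }
            // The read-only wrapper still implements IList<int>, so ElementAt
            // can use the indexer, but Add/Remove through a cast will throw.
            return _cache.AsReadOnly();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var source = new SquaresSource();
        Console.WriteLine(source.Values.ElementAt(2)); // 9

        var list = (IList<int>)source.Values;
        try
        {
            list.Add(36);
        }
        catch (NotSupportedException)
        {
            Console.WriteLine("read-only");
        }
    }
}
```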
One thing I've used ToList for is when I've got a complex system of Linq expressions mixed with custom operators that use yield return to filter or transform lists. Stepping through in the debugger can get mighty confusing as it jumps around doing lazy evaluation, so I sometimes temporarily add a ToList() to a few places so that I can more easily follow the execution path. (Although if the things you are executing have side-effects, this can change the meaning of the program.)
It depends if you need to modify the collection. I like to use an Array when I know that no one is going to add/delete items. I use a list when I need to sort/add/delete items. But, usually I just leave it as IEnumerable as long as I can.
If you don't need the added features of List<>, why not just stick with IQueryable<> ?!?!?! Lowest common denominator is the best solution (especially when you see Timothy's answer).
