Does this extension method efficiently materialize my IQueryable? - linq

So I recently discovered that you can force Entity Framework not to translate your projection into SQL by specifying a Func<T, TResult> to the .Select() extension method rather than an expression. This is useful when you want to transform your queried data, but that transformation should happen in your code rather than on the database.
For example, when using EF5's new Enum support and trying to project that to a string property in a DTO, whereas this would fail:
results.Select(r => new Dto { Status = r.Status.ToString() })
this would work:
results.Select(new Func<Record, Dto>(r => new Dto { Status = r.Status.ToString() }));
because in the first (expression) case, EF can't figure out how to translate Status.ToString() to something the SQL database could perform, but as per this article Func predicates aren't translated.
Once I had this working, it wasn't much of a leap to create the following extension method:
public static IQueryable<T> Materialize<T>(this IQueryable<T> q)
{
return q.Select(new Func<T, T>(t => t)).AsQueryable();
}
So my question is - are there any pitfalls I should be wary of when using this? Is there a performance impact - either in injecting this do-nothing projection into the query pipeline or by causing EF to not send the .Where() clause to the server and thereby send all the results over the wire?
The intention is to still use a .Where() method to filter the results on the server, but then to use .Materialize() before .Select() so that the provider doesn't try to translate the projection to SQL Server:
return Context.Record
.Where(r => // Some filter to limit results)
.Materialize()
.Select(r => // Some projection to DTO, etc.);

Simply using AsEnumerable should do the same:
return Context.Record
.Where(r => // Some filter to limit results)
.AsEnumerable() // All extension methods now accept Func instead of Expression
.Select(r => // Some projection to DTO, etc.);
There is also no reason in your Materialize method to go back to IQueryable because it is not a real IQueryable translated to another query any more. It is just IEnumerable.
In terms of performance you should be OK. Everything before materialization is evaluated in the database and everything after materialization in your code. Moreover in both your and my example query has still deferred execution - it is not executed until something enumerates the query.

There is one problem though: number of columns fetched to client. In first case, it would be something like select Status from Record and in another select Status, field2, field3, field4 from Record

Related

LINQ to Entities seemingly odd behavior

I have this Linq query that translates very oddly to SQL. I get the correct results but there must be a better way. So question 1 is:
Why is it that in SQL I get no group by, no count and all of the
columns are returned instead of just 2; and then the results in C# are correct? (I checked with profiler).
and question 2 is:
I would like to modify the query slightly so that I get also the
results where count is 0. At the moment I only get where counts > 0
because of the group by.
LINQ:
List<Tuple<string, int>> countPerType = db1.Audits
.OrderBy(p => p.CreatedBy)
.GroupBy(o => new { o.Type, o.CreatedBy })
.ToList()
.Select(g => new Tuple<string, int>(g.Select(f => f.CreatedBy + ',' + f.Type).FirstOrDefault(),
(int?)g.Count() ?? 0))
.ToList();
Note that if I remove the .ToList() in the middle, I get exception "only parameterless constructors and initializers are supported in linq to entities".
Thanks for your input
You run into several problems. I think the cause of this is that you aren't aware of the difference between queries that are AsEnumerable and queries that are AsQueryable.
AsEnumerable queries contain all information to enumerate over the elements in the query. The query will be executed by your process.
An AsQueryable query, contains a Expression and a Provider. The Provider knows who will execute the query, and how to communicate with this executer. Quite often the executer will be a database, but it can be other things, like internet queries, jswon files etc.
In your case the executer will be a database, the language will be SQL.
When the GetEnumerator() function of your IQueryable is called, the Provider is ordered to translate the Expression into the language that the executor knows. The translated query is sent to the executor and the returned data is put into an Enumerator (not IEnumerable!)
Of course SQL does not know what a System.Tuple is, nor does it know functions like String.operator+
Therefore your Provider can't translate your expression into SQL. That is the reason you have to do your first ToList()
You can't make queries as IQueryable with any of your own functions, and only a limited amount of .NET functions.
See this list of supported and unsupported Linq methods
It is not advise to use ToList() in this stadium of your query, because it enumerates all elements of your sequence, will in fact you only need an enumerator. It could be that during the rest of your query you'd only want a few elements. In that case it would be a waste to enumerate over all of them to create a list, and then to enumerate again to do the rest of your LINQ.
Instead of ToList() use Enumerable.AsEnumerable(). This will bring all data of the query to local memory and create an IEnumerable of it: the elements are not enumerated yet. This will allow you to call local functions with the rest of your query.
Another problem is that you transport way more data to local memory than you plan to use. One of the slower parts of database queries is the transport of data to your process. You should minimize the amount of data.
You took all Audits, and created groups of Audits that have the same values for (Type, CreatedBy). In other words: all Audits in the same group have the same values for (Type, CreatedBy). This value is also the Key of the group.
You don't want all Audits locally, you only want the Key of the group and the number of elements of this group (= the number of audits that have (Type, CreatedBy) equal to the key.
This is the only data you need to transport to local memory: Type, CreatedBy and the number of audits in the group:
var result = db1.Audits.GroupBy(o => new { o.Type, o.CreatedBy })
.Select(group => new
{
Type = group.Key.Type,
CreatedBy = group.Key.CreatedBy,
AuditCount = group.Count(),
})
.OrderBy(item => item.CreatedBy)
// the data that is left is the data you need locally
// bring to local memory:
.AsEnumerable()
// if you want you can put Type and CreatedBy into one string
.Select(item => new
{
AuditType = item.Type + item.CreatedBy,
AuditCount = item.AuditCount,
});
I chose not to put the result in a Tuple, because you would lose the help from the compiler if you mix up fields. But if you really want to suit yourself.

Cannot form a select statement for query in silverlight

I want to do something like
from table1
where col5="abcd"
select col1
I did like
query_ = From g In DomainService.GetGEsQuery Select New GE With {.Desc = g.codDesc}
"This cause a runtime error, i tried various combinations but failed"
please help.
I'm assuming your trying to do this on the client side. If so you could do something like this
DomainService.Load(DomainService.GetGEsQuery().Where(g => g.codDesc == "something"), lo =>
{
if (lo.HasError == false)
{
List<string> temp = lo.Entities.Select(a => a.Name).ToList();
}
}, null);
you could also do this in the server side (which i would personally prefer) like this
public IQueryable<string> GetGEStringList(string something)
{
return this.ObjectContext.GE.Where(g => g.codDesc == something).Select(a => a.Name);
}
Hope this helps
DomainService.GetGEsQuery() returns an IQueryable, that is only useful in a subsequent asynchronous load. Your are missing the () on the method call, but that is only the first problem.
You can apply filter operations to the query returned using Where etc, but it still needs to be passed to the Load method of your domain context (called DomainService in your example).
The example Jack7 has posted shows an anonymous callback from the load method which then accesses the results inside the load object lo and extracts just the required field with another query. Note that you can filter the query in RIA services, but not change the basic return type (i.e. you cannot filter out unwanted columns on the client-side).
Jack7's second suggestion to implement a specific method server-side, returning just the data you want, is your best option.

Test LINQ to SQL expression

I am writing an application that works with MS SQL database via LINQ to SQL. I need to perform filtering sometimes, and occasionally my filtering conditions are too complicated to be translated into SQL query. While I am trying to make them translatable, I want my application to at least work, though slow sometimes.
LINQ to SQL data model is hidden inside repositories, and I do not want to provide several GetAll method overloads for different cases and be aware of what overload to use on upper levels. So I want to test my expression inside repository to be translatable and, if no, perform in-memory query against the whole data set instead of throwing NotSupportedException on query instantiating.
This is what I have now:
IQueryable<TEntity> table = GetTable<TEntity>();
IQueryable<TEntity> result;
try
{
result = table.Where(searchExpression);
//this will test our expression
//consuming as little resources as possible (???)
result.FirstOrDefault();
}
catch (NotSupportedException)
{
//trying to perform in-memory search if query could not be constructed
result = table
.AsEnumerable()
.Where(searchExpression.Compile())
.AsQueryable();
}
return result;
searchExpression is Expression<Func<TEntity, bool>>
As you see, I am using FirstOrDefault to try to instantiate the query and throw the exception if it cannot be instantiated. However, it will perform useless database call when the expression is good. I could use Any, Count or other method, and it may well be a bit less expensive then FirstOrDefault, but still all methods that come to my mind make a costly trip to database, while all I need is to test my expression.
Is there any alternative way to say whether my expression is 'good' or 'bad', without actual database call?
UPDATE:
Or, more generally, is there a way to tell LINQ to make in-memory queries when it fails to construct SQL, so that this testing mechanism would not be needed at all?
Instead of
result.FirstOrDefault();
would it be sufficient to use
string sqlCommand = dataContext.GetCommand(result).CommandText;
?
If the expression does not generate valid Sql, this should throw a NotSupportedException, but it does not actually execute the sqlCommand.
I think this will solve your problem:
IQueryable<TEntity> table = GetTable<TEntity>();
IQueryable<TEntity> result;
try
{
return table.Where(searchExpression).ToList();
}
catch (NotSupportedException)
{
//trying to perform in-memory search if query could not be constructed
return table
.AsEnumerable()
.Where(searchExpression.Compile())
.ToList();
}
So the method returns is the expression is converted to valid SQL. Otherwise it catches the exception and runs the query in memory. This should work but it doesn't answer your question if it's possible to check if a specific searchExpression can be converted. I don't think such a thing exists.

Entity Framework LINQ Query using Custom C# Class Method - Once yes, once no - because executing on the client or in SQL?

I have two Entity Framework 4 Linq queries I wrote that make use of a custom class method, one works and one does not:
The custom method is:
public static DateTime GetLastReadToDate(string fbaUsername, Discussion discussion)
{
return (discussion.DiscussionUserReads.Where(dur => dur.User.aspnet_User.UserName == fbaUsername).FirstOrDefault() ?? new DiscussionUserRead { ReadToDate = DateTime.Now.AddYears(-99) }).ReadToDate;
}
The linq query that works calls a from after a from, the equivalent of SelectMany():
from g in oc.Users.Where(u => u.aspnet_User.UserName == fbaUsername).First().Groups
from d in g.Discussions
select new
{
UnReadPostCount = d.Posts.Where(p => p.CreatedDate > DiscussionRepository.GetLastReadToDate(fbaUsername, p.Discussion)).Count()
};
The query that does not work is more like a regular select:
from d in oc.Discussions
where d.Group.Name == "Student"
select new
{
UnReadPostCount = d.Posts.Where(p => p.CreatedDate > DiscussionRepository.GetLastReadToDate(fbaUsername, p.Discussion)).Count(),
};
The error I get is:
LINQ to Entities does not recognize the method 'System.DateTime GetLastReadToDate(System.String, Discussion)' method, and this method cannot be translated into a store expression.
My question is, why am I able to use my custom GetLastReadToDate() method in the first query and not the second? I suppose this has something to do with what gets executed on the db server and what gets executed on the client? These queries seem to use the GetLastReadToDate() method so similarly though, I'm wondering why would work for the first and not the second, and most importantly if there's a way to factor common query syntax like what's in the GetLastReadToDate() method into a separate location to be reused in several different other LINQ queries.
Please note all these queries are sharing the same object context.
I think your better of using a Model Defined Function here.
Define a scalar function in your database which returns a DateTime, pass through whatever you need, map it on your model, then use it in your LINQ query:
from g in oc.Users.Where(u => u.aspnet_User.UserName == fbaUsername).First().Groups
from d in g.Discussions
select new
{
UnReadPostCount = d.Posts.Where(p => p.CreatedDate > myFunkyModelFunction(fbaUsername, p.Discussion)).Count()
};
and most importantly if there's a way to factor common query syntax like what's in the GetLastReadToDate() method into a separate location to be reused in several different places LINQ queries.
A stored procedure would probably be one way to store that 'common query syntax"...EF, at least 4.0, works very nicely with SP's.

Linq Containsfunction problem

I use Ria Service domainservice for data query.
In My database, there is a table People with firstname, lastname. Then I use EF/RIA services for data processing.
Then I create a Filter ViewModel to capture user inputs, based it the input, I construct a linq Query to access data.
At server side, the default DomainService query for person is:
public IQueryable<Person> GetPerson()
{
return this.Context.Person;
}
At client side, the linq Query for filter is something like(I use Contains function here):
if (!String.IsNullOrEmpty(this.LastName))
q = q.Where(p => (p.LastName.Contains(this.LastName)));
The generated linq query is something like(when debugging,I got it):
MyData.Person[].Where(p => (p.LastName.Contains(value(MyViewModel.PersonFilterVM).LastName) || p.Person.LegalLastName.Contains(value(MyViewModel.PersonFilterVM).LastName)))
When I run the app, I put "Smith" for last name for search, but the result is totally irrelevant with "Smith"!
How to fix it?
I'm guessing here as to what your error is so this might not work for you.
In your 2nd code snippet you do the following.
q = q.Where(p => (p.LastName.Contains(this.LastName)));
This is where I think your error is. Linq does not evaluate the where clause until you iterate over it. Try changing the line to the following.
qWithData = q.Where(p => (p.LastName.Contains(this.LastName))).ToList();
The .ToList() call will load the query with data.
When you check in the debugger, does value(MyViewModel.PersonFilterVM).LastName evaluate to Smith at the time the query is resolved?
Recall that queries are not resolved until they are enumerated.

Resources