Test LINQ to SQL expression - linq

I am writing an application that works with an MS SQL database via LINQ to SQL. I sometimes need to filter, and occasionally my filter conditions are too complicated to be translated into a SQL query. While I work on making them translatable, I want my application to at least work, even if it is sometimes slow.
The LINQ to SQL data model is hidden inside repositories, and I do not want to provide several GetAll overloads for different cases and have the upper levels know which overload to use. So I want to test inside the repository whether my expression is translatable and, if it is not, run an in-memory query against the whole data set instead of throwing NotSupportedException when the query is instantiated.
This is what I have now:
IQueryable<TEntity> table = GetTable<TEntity>();
IQueryable<TEntity> result;
try
{
    result = table.Where(searchExpression);
    //this will test our expression
    //consuming as little resources as possible (???)
    result.FirstOrDefault();
}
catch (NotSupportedException)
{
    //trying to perform in-memory search if query could not be constructed
    result = table
        .AsEnumerable()
        .Where(searchExpression.Compile())
        .AsQueryable();
}
return result;
searchExpression is Expression<Func<TEntity, bool>>
As you see, I am using FirstOrDefault to try to instantiate the query and throw the exception if it cannot be instantiated. However, it performs a useless database call when the expression is good. I could use Any, Count or another method, and it may well be a bit less expensive than FirstOrDefault, but all the methods that come to my mind still make a costly trip to the database, while all I need is to test my expression.
Is there any alternative way to say whether my expression is 'good' or 'bad', without actual database call?
UPDATE:
Or, more generally, is there a way to tell LINQ to make in-memory queries when it fails to construct SQL, so that this testing mechanism would not be needed at all?

Instead of
result.FirstOrDefault();
would it be sufficient to use
string sqlCommand = dataContext.GetCommand(result).CommandText;
?
If the expression cannot be translated into valid SQL, this should throw a NotSupportedException, and in either case it does not actually execute the command.
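The check-then-fall-back pattern can be sketched independently of the data context. In the minimal sketch below, TranslationFallback, Filter and the probe parameter are hypothetical names introduced for illustration. With LINQ to SQL, the probe would presumably be something like q => dataContext.GetCommand(q), which asks the provider to translate the query without running it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class TranslationFallback
{
    // probe is an assumption of this sketch: any action that makes the provider
    // translate the query without executing it and throws NotSupportedException
    // when translation fails.
    public static IEnumerable<T> Filter<T>(
        IQueryable<T> table,
        Expression<Func<T, bool>> predicate,
        Action<IQueryable<T>> probe)
    {
        IQueryable<T> query = table.Where(predicate);
        try
        {
            probe(query);   // translation only, no round trip expected
            return query;   // translatable: the provider does the filtering
        }
        catch (NotSupportedException)
        {
            // Not translatable: read the whole table and filter in memory.
            return table.AsEnumerable().Where(predicate.Compile());
        }
    }
}
```

Returning IEnumerable<T> from the fallback, rather than wrapping it with AsQueryable(), is a design choice of this sketch: it keeps callers from assuming that further operators will still be sent to the database.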

I think this will solve your problem:
IQueryable<TEntity> table = GetTable<TEntity>();
try
{
    //ToList forces the query to be translated and executed here
    return table.Where(searchExpression).ToList();
}
catch (NotSupportedException)
{
    //trying to perform in-memory search if query could not be constructed
    return table
        .AsEnumerable()
        .Where(searchExpression.Compile())
        .ToList();
}
So the method returns if the expression is converted to valid SQL. Otherwise it catches the exception and runs the query in memory. This should work, but it doesn't answer your question of whether it's possible to check if a specific searchExpression can be converted. I don't think such a thing exists.

Related

dotnet core azure documentdb linq lambda query not running query on server

I'm running dotnet core and accessing documentdb.
When I try to run a query using a linq where clause it returns but it takes a long time and doesn't seem to filter on the server. I was able to resolve this by using the SqlQuerySpec to run the query and it now appears to run the query criteria on the server.
Is this a known issue or am I missing something?
The one that doesn't work:
var query = _client.CreateDocumentQuery<T>(documentCollection.DocumentsLink).Where(criteria);
return query.ToList();
criteria is of type
Func<T, bool> criteria
The one that does work:
var documentQuery = _client.CreateDocumentQuery<T>(UriFactory.CreateDocumentCollectionUri(_databaseName, collectionName), query).AsDocumentQuery();
List<T> results = new List<T>();
while (documentQuery.HasMoreResults)
{
results.AddRange(await documentQuery.ExecuteNextAsync<T>());
}
return results;
query is of type
SqlQuerySpec query
Is this a feature that is lagging behind in dotnet core's implementation of the documentdb sdk vs the standard .NET package?
The issue is that you are using Func<T, bool> for criteria. The Where overload that accepts a Func is the one for IEnumerable, and by design IEnumerable does in-memory (client-side) filtering.
CreateDocumentQuery<T>() actually returns an IQueryable. You need to change your criteria type to Expression<Func<T, bool>>, as this is what Where() on an IQueryable expects.
When you use an Expression, your LINQ expression is converted to the database specific SQL query and will be executed on the server.
Uri documentCollectionUri = UriFactory.CreateDocumentCollectionUri(DatabaseId, CollectionId);
var documentQuery = client.CreateDocumentQuery<T>(documentCollectionUri)
    .Where(predicate)
    .AsDocumentQuery();
List<T> results = new List<T>();
while (documentQuery.HasMoreResults)
{
    results.AddRange(await documentQuery.ExecuteNextAsync<T>());
}
return results;
Where predicate is Expression<Func<T, bool>>
One important thing to remember: you can only use those LINQ extensions that have an equivalent function in DocumentDB's SQL language. For example, you can use Take() but you cannot use Skip(), you cannot use array contains on specific nested fields, etc.
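The difference between the two parameter types can be seen without DocumentDB at all. The following is a minimal sketch using an in-memory IQueryable; OverloadDemo and its methods are hypothetical names for illustration. It reports the static type the compiler chooses for the result of Where in each case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class OverloadDemo
{
    // A Func argument can only match Enumerable.Where, so the result is
    // typed as IEnumerable<int> and the filter runs on the client.
    public static string WithFunc(IQueryable<int> source, Func<int, bool> criteria)
    {
        var filtered = source.Where(criteria);
        return StaticTypeName(filtered);
    }

    // An Expression argument matches Queryable.Where, so the result stays
    // an IQueryable<int> that a provider can translate.
    public static string WithExpression(IQueryable<int> source, Expression<Func<int, bool>> criteria)
    {
        var filtered = source.Where(criteria);
        return StaticTypeName(filtered);
    }

    // typeof(T) reflects the compile-time type inferred for the argument.
    private static string StaticTypeName<T>(T value) => typeof(T).Name;
}
```

Called with new[] { 1, 2, 3 }.AsQueryable(), WithFunc reports IEnumerable`1 and WithExpression reports IQueryable`1, even though the source object is the same.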
When I try to run a query using a linq where clause it returns but it takes a long time and doesn't seem to filter on the server.
If we capture the request sent when running var query = _client.CreateDocumentQuery<T>(documentCollection.DocumentsLink).Where(criteria);, we find that it performs a GET on the documents resource of the collection,
GET https://{databaseaccount}.documents.azure.com/dbs/{db-id}/colls/{coll-id}/docs, to list the documents in the collection; it does not actually query and filter on the DocumentDB server.
The Where method then filters that sequence with the predicate on the client; the search condition is never passed to the DocumentDB server.

Using local methods for querying data from DocumentDB

I have the following function in my base repository
public IEnumerable<T> GetAll(ClaimsPrincipal user)
{
    return client.CreateDocumentQuery<T>(collectionLink)
        .Where(doc => doc.Type == DocType)
        .AsEnumerable()
        .Where(doc =>
            doc.Owner == user.Id()
            || user.InGroup(doc.Owner)
            || doc.Public);
}
I have two extension methods on the ClaimsPrincipal class. Id() just returns .FindFirst("user_id").Value and .InGroup(<id>) checks if the user has a group-membership to the group that owns the document.
However, am I correct in assuming that once I call .AsEnumerable() the query goes to the database with only the first where-clause, returns everything it matches, and then does the second where-clause on the client side?
However, am I correct in assuming that once I call .AsEnumerable() the query goes to the database
Not quite.
When this method returns, it won't have hit the database at all. AsEnumerable() doesn't force the query to execute - it's basically just a cast to IEnumerable<T>.
You're right in saying the first Where is performed at the database and the second Where is performed client-side, but "returns everything it matches" suggests that's done in bulk - which it might be, but it may be streamed as well. The client-side Where will certainly stream, but whether the IQueryable<T> implementation fetches everything eagerly is an implementation detail.
If you're really only interested in which filters are in the database and which are local though, you're correct.
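The deferred behaviour is easy to observe with an in-memory source. This is a minimal sketch, not DocumentDB code: DeferredDemo, Source and ItemsProduced are hypothetical names, and the counter stands in for "rows fetched from the database".

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many items the underlying source has actually produced.
    public static int ItemsProduced;

    // Iterator standing in for the data store; its body runs only on enumeration.
    public static IEnumerable<int> Source()
    {
        for (int i = 1; i <= 5; i++)
        {
            ItemsProduced++;
            yield return i;
        }
    }

    // Same shape as the repository method: a queryable Where,
    // then AsEnumerable(), then a client-side Where.
    public static IEnumerable<int> BuildQuery() =>
        Source().AsQueryable()
            .Where(x => x > 1)          // expression, handled by the query provider
            .AsEnumerable()             // only changes the static type
            .Where(x => x % 2 == 0);    // delegate, evaluated in memory
}
```

After BuildQuery() returns, ItemsProduced is still 0; the source is only read once the result is enumerated, for example by ToList().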

Does this extension method efficiently materialize my IQueryable?

So I recently discovered that you can force Entity Framework not to translate your projection into SQL by specifying a Func<T, TResult> to the .Select() extension method rather than an expression. This is useful when you want to transform your queried data, but that transformation should happen in your code rather than on the database.
For example, when using EF5's new Enum support and trying to project that to a string property in a DTO, whereas this would fail:
results.Select(r => new Dto { Status = r.Status.ToString() })
this would work:
results.Select(new Func<Record, Dto>(r => new Dto { Status = r.Status.ToString() }));
because in the first (expression) case, EF can't figure out how to translate Status.ToString() to something the SQL database could perform, but as per this article Func predicates aren't translated.
Once I had this working, it wasn't much of a leap to create the following extension method:
public static IQueryable<T> Materialize<T>(this IQueryable<T> q)
{
    return q.Select(new Func<T, T>(t => t)).AsQueryable();
}
So my question is - are there any pitfalls I should be wary of when using this? Is there a performance impact - either in injecting this do-nothing projection into the query pipeline or by causing EF to not send the .Where() clause to the server and thereby send all the results over the wire?
The intention is to still use a .Where() method to filter the results on the server, but then to use .Materialize() before .Select() so that the provider doesn't try to translate the projection to SQL Server:
return Context.Record
    .Where(r => /* some filter to limit results */)
    .Materialize()
    .Select(r => /* some projection to DTO, etc. */);
Simply using AsEnumerable should do the same:
return Context.Record
    .Where(r => /* some filter to limit results */)
    .AsEnumerable() // All extension methods now accept Func instead of Expression
    .Select(r => /* some projection to DTO, etc. */);
There is also no reason for your Materialize method to go back to IQueryable, because the result is no longer a real IQueryable that gets translated to another query. It is just an IEnumerable.
In terms of performance you should be OK. Everything before materialization is evaluated in the database and everything after materialization in your code. Moreover, in both your example and mine the query still has deferred execution: it is not executed until something enumerates it.
There is one problem though: the number of columns fetched to the client. When the projection is translated to SQL, the query would be something like select Status from Record; when the projection runs in memory, it would be select Status, field2, field3, field4 from Record.
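One common way to limit the columns fetched is to project to an anonymous type before AsEnumerable() and do the untranslatable part afterwards. The following is a minimal sketch under assumptions: Record, Dto, Status and ProjectionDemo are hypothetical types, and the in-memory source used to run it cannot show the generated SQL, so the narrower SELECT is what a provider such as EF would be expected to produce, not something this code verifies.

```csharp
using System.Collections.Generic;
using System.Linq;

public enum Status { Open, Closed }

public class Record
{
    public int Id { get; set; }
    public Status Status { get; set; }
    public string Notes { get; set; }
}

public class Dto
{
    public string Status { get; set; }
}

public static class ProjectionDemo
{
    public static List<Dto> ToDtos(IQueryable<Record> records)
    {
        return records
            .Where(r => r.Id > 0)                 // filter: part of the provider query
            .Select(r => new { r.Status })        // still an expression: only Status is requested
            .AsEnumerable()                       // everything below runs in memory
            .Select(x => new Dto { Status = x.Status.ToString() })
            .ToList();
    }
}
```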

Linq to NHibernate generating 3,000+ SQL statements in one request!

I've been developing a webapp using Linq to NHibernate for the past few months, but haven't profiled the SQL it generates until now. Using NH Profiler, it now seems that the following chunk of code hits the DB more than 3,000 times when the Linq expression is executed.
var activeCaseList = from c in UserRepository.GetCasesByProjectManagerID(consultantId)
                     where c.CompletionDate == null
                     select new { c.PropertyID, c.Reference, c.Property.Address, DaysOld = DateTime.Now.Subtract(c.CreationDate).Days, JobValue = String.Format("£{0:0,0}", c.JobValue), c.CurrentStatus };
Where the Repository method looks like:
public IEnumerable<Case> GetCasesByProjectManagerID(int projectManagerId)
{
    return from c in Session.Linq<Case>()
           where c.ProjectManagerID == projectManagerId
           select c;
}
It appears to run the initial Repository query first, then iterates through all of the results checking to see if the CompletionDate is null, but issuing a query to get c.Property.Address first.
So if the initial query returns 2,000 records, even if only five of them have no CompletionDate, it still fires off an SQL query to bring back the address details for the 2,000 records.
The way I had imagined this would work is that it would evaluate all of the WHERE and SELECT clauses and simply amalgamate them, so the initial query would be like:
SELECT ... WHERE ProjectManager = @p1 AND CompletionDate IS NULL
Which would yield 5 records, and then it could fire the further 5 queries to obtain the addresses. Am I expecting too much here, or am I simply doing something wrong?
Anthony
Change the declaration of GetCasesByProjectManagerID:
public IQueryable<Case> GetCasesByProjectManagerID(int projectManagerId)
You can't compose queries with IEnumerable<T> - they're just sequences. IQueryable<T> is specifically designed for composition like this.
I can't add a comment yet, so: Jon Skeet is right, you'll want to use IQueryable, as this allows the LINQ provider to lazily construct the SQL from the whole composed query. With IEnumerable, any operators added afterwards run in memory on the client.
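The effect of the return type on composition can be checked with an in-memory queryable. This is a minimal sketch, not NHibernate code; CompositionDemo and its members are hypothetical names. It counts how many Where calls ended up in the expression tree that a provider would get to translate.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class CompositionDemo
{
    private static readonly IQueryable<int> Table = Enumerable.Range(1, 6).AsQueryable();

    // Repository method typed as IEnumerable: callers compose with LINQ to Objects.
    public static IEnumerable<int> CasesAsEnumerable() => Table.Where(x => x > 2);

    // Repository method typed as IQueryable: callers extend the same expression tree.
    public static IQueryable<int> CasesAsQueryable() => Table.Where(x => x > 2);

    // Walks down the outermost chain of Where calls in the query's expression tree.
    public static int CountWhereCalls(IQueryable query)
    {
        int count = 0;
        Expression node = query.Expression;
        while (node is MethodCallExpression call && call.Method.Name == "Where")
        {
            count++;
            node = call.Arguments[0];
        }
        return count;
    }
}
```

With CasesAsQueryable().Where(x => x % 2 == 0), both filters are in one tree (CountWhereCalls returns 2). With CasesAsEnumerable().Where(x => x % 2 == 0), the result is not an IQueryable at all, so the second filter can only run in memory.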

Using LINQ Expression Instead of NHIbernate.Criterion

If I were to select some rows based on certain criteria I can use ICriterion object in NHibernate.Criterion, such as this:
public List<T> GetByCriteria()
{
    SimpleExpression newJobCriterion =
        NHibernate.Criterion.Expression.Eq("LkpStatu", statusObject);
    ICriteria criteria = Session.GetISession().CreateCriteria(typeof(T)).SetMaxResults(maxResults);
    criteria.Add(newJobCriterion);
    return criteria.List<T>();
}
Or I can use LINQ's where clause to filter what I want:
public List<T> GetByCriteria_LINQ()
{
    ICriteria criteria = Session.GetISession().CreateCriteria(typeof(T)).SetMaxResults(maxResults);
    return criteria.Where(item => item.LkpStatu == statusObject).ToList();
}
I would prefer the second one, of course, because it gives me strong typing and I don't need to learn yet another syntax in the form of NHibernate.
The question is: is there any performance advantage of the first one over the second? From what I know, the first one creates SQL queries, so it filters the data before it is loaded into memory. Is this kind of performance saving big enough to justify its use?
As usual, it depends. First, note that in your second snippet .List() is missing right after return criteria. Also note that you won't get the same results from the two examples: the first one applies the where and then returns the top maxResults, whereas the second one first selects the top maxResults and then applies the where.
If your expected result set is relatively small and you are likely to use some of the results in lazy loads, then it's actually better to take the second approach, because all entities loaded through a session stay in its first-level cache.
Usually, however, you don't do it this way and use the first approach.
Perhaps you wanted to use NHibernate.Linq (located in the Contrib project), which translates LINQ to Criteria for you.
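The difference in results between "filter, then limit" and "limit, then filter" mentioned above can be shown with plain LINQ to Objects. This is a minimal sketch; OrderingDemo and its method names are hypothetical, and an even-number test stands in for the status criterion.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class OrderingDemo
{
    // Like the Criteria version: apply the condition, then keep the first maxResults.
    public static List<int> FilterThenLimit(IEnumerable<int> rows, int maxResults) =>
        rows.Where(x => x % 2 == 0).Take(maxResults).ToList();

    // Like the second snippet: fetch the first maxResults, then apply the condition.
    public static List<int> LimitThenFilter(IEnumerable<int> rows, int maxResults) =>
        rows.Take(maxResults).Where(x => x % 2 == 0).ToList();
}
```

For the rows 1 to 10 and maxResults of 3, FilterThenLimit yields 2, 4, 6 while LimitThenFilter yields only 2.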
I combined the two and made this:
var crit = _session.CreateCriteria(typeof (T)).SetMaxResults(100);
return (from x in _session.Linq<T>(crit) where x.field == <something> select x).ToList();

Resources