Linq where keyword vs. Where extension and Expression parameters - linq

Passing in an Expression to a Linq query behaves differently depending on syntax used, and I wonder why this is the case.
Let's say I have this very generic function
private IEnumerable<Company>
GetCompanies(Expression<Func<Company, bool>> whereClause)
The following implementation works as expected
private IEnumerable<Company>
GetCompanies(Expression<Func<Company, bool>> whereClause)
{
return (from c in _ctx.Companies.Where(whereClause) select c);
}
But this next implementation does not compile
(Delegate 'System.Func' does not take 1 arguments)
private IEnumerable<Company>
GetCompanies(Expression<Func<Company, bool>> whereClause)
{
return (from c in _ctx.Companies where whereClause select c);
}
Obviously I can just use the first syntax, but I was just wondering why the compiler does not treat the where keyword the same as the Where extension?
Thanks,
Thomas

The syntax for a query expression involving a where clause is (simplifying the complete grammar)
from identifier in expression where boolean-expression select expression
whereClause is not a boolean expression. To rectify this, you have to say
from c in _ctx.Companies where whereClause.Compile()(c) select c;
Note that if whereClause were a Func<Company, bool> you could get away with
from c in _ctx.Companies where whereClause(c) select c;
Note that
from x in e where f
is translated mechanically by the compiler into
(from x in e).Where(x => f)
I say mechanically because it performs this translation without doing any semantic analysis to check validity of the method calls etc. That stage comes later after all query expressions have been translated to LINQ method-invocation expressions.
In particular,
from c in _ctx.Companies where whereClause select c
is translated to
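_ctx.Companies.Where(c => whereClause)
which is clearly nonsensical: the lambda returns an Expression<Func<Company, bool>>, not a bool. (The trivial select c is dropped by the translation because a where clause is present.)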
The reason that
from c in _ctx.Companies.Where(whereClause) select c
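is legit is that _ctx.Companies is presumably an IQueryable<Company> (as a LINQ to SQL or Entity Framework table would be), and Queryable.Where has an overload accepting an Expression<Func<Company, bool>>. Note that there is no implicit conversion from an Expression<Func<Company, bool>> to a Func<Company, bool>; to get a delegate you have to call Compile().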
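The two forms can be sketched side by side. This is a minimal illustration, not the asker's actual data context: the Company class and the CompanyQueries helper are hypothetical stand-ins, and the query-syntax variant runs in memory because it compiles the expression tree into a delegate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical stand-in for the question's entity type.
public class Company
{
    public string Name { get; set; }
}

public static class CompanyQueries
{
    // Method syntax: Queryable.Where accepts the expression tree as-is,
    // so a LINQ provider could still translate it (e.g. to SQL).
    public static IEnumerable<Company> ViaWhereMethod(
        IQueryable<Company> companies,
        Expression<Func<Company, bool>> whereClause)
    {
        return companies.Where(whereClause);
    }

    // Query syntax: the where clause must be a bool, so the tree is
    // compiled once and the resulting delegate is called per element.
    public static IEnumerable<Company> ViaWhereKeyword(
        IEnumerable<Company> companies,
        Expression<Func<Company, bool>> whereClause)
    {
        Func<Company, bool> predicate = whereClause.Compile();
        return from c in companies where predicate(c) select c;
    }
}
```

With a real database-backed source, the first form is usually preferable because the predicate stays an expression tree that the provider can inspect.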

You can actually shorten the whole thing to:
private IEnumerable<Company>
GetCompanies(Expression<Func<Company, bool>> whereClause)
{
return _ctx.Companies.Where(whereClause);
}
When you use the LINQ syntax against an IQueryable<T> source, the code in the where clause is translated into an Expression<>, which represents a code tree. When you accept Expression<Func<Company, bool>>, you are saying that your method accepts a code tree, which is converted from C# code by the compiler.
Since you already have the code tree, you have to pass it directly to the Where() method, rather than using LINQ syntax.

The difference is that the SQL-like where clause expects an expression that evaluates to bool, whereas the argument to the Where method can be a delegate or an expression tree.
To force the second form to work, you can change it to whereClause.Compile()(c), or change the parameter to Func<Company, bool> whereClause and write whereClause(c).

Related

why is LINQ allowed to work as part of the language (it has spaces in the statement)

Why is LINQ allowed to have spaces in the statement? An example statement:
var questions = from item in db.questions
select item;
As programmers, we cannot create functions or methods with spaces in them, or anything that resembles the LINQ syntax. Is this something that is just specially parsed by the language? Would there be any way to let programmers make up their own LINQ-like statements?
Because they are "keywords" (technically they are "contextual keywords". They are "keywords" only in certain places) :-) Look here: Query Keywords (C# Reference) and C# Keywords
Why can you write public static void with spaces? Because they are keywords :-)
And no, you can't add new keywords to C#.
(but note that you can use the LINQ syntax on non-IEnumerable/IQueryable objects. LINQ syntax is converted "blindly" to specific method names. The compiler doesn't check if they are really IEnumerable<T> or IQueryable<T>)
Try this:
class Test
{
public Test Where(Func<Test, bool> predicate)
{
Console.WriteLine("Doing the Where");
return this;
}
public T Select<T>(Func<Test, T> action)
{
Console.WriteLine("Doing the Select");
return action(this);
}
}
var res = from p in new Test() where p != null select new Test();
It's syntactic sugar that the compiler understands. You can't change the compiler so can't do the same I'm afraid

F# Power Pack Linq Issue

I have a simple function that makes use of the F# power pack to convert a quotation into a linq expression. The function is:
let toLinq (exp : Expr<'a -> 'b>) =
let linq = exp.ToLinqExpression()
let call = linq :?> MethodCallExpression
let lambda = call.Arguments.[0] :?> LambdaExpression
Expression.Lambda<Func<'a, 'b>>(lambda.Body, lambda.Parameters)
I use this function to create expressions that are consumed by a C# library that uses linq to sql to query a database. For example I might build an expression like:
let test = toLinq (<@ fun u -> u.FirstName = "Bob" @> : Expr<Account->bool>)
and pass it to a method like:
public IEnumerable<T> Find(Expression<Func<T, bool>> predicate)
{
var result = Table.OfType<T>();
result = result.Where(predicate);
var resultArray = result.ToArray();
return resultArray;
}
This was working as designed in version 1.9.9.9 of the power pack. However, it no longer works in the latest version of the power pack. The error I receive is Method 'Boolean GenericEqualityIntrinsic[String](System.String, System.String)' has no supported translation to SQL.
I took a look at the changes to the power pack and it seems that the linq expression that is built using the new version makes use of GenericEqualityIntrinsic for comparing the property's value with the constant, whereas in version 1.9.9.9 it made use of String.op_Equality for comparison.
Is this a correct understanding of the issue? How do I make use of the new version of the power pack to convert quotations to linq expressions that can be consumed by a c# library that uses linq to sql?
Does explicitly calling
System.String.op_Equality(s1,s2)
work?
You can try the quotation as:
<@ fun u -> u.FirstName.Equals("Bob") @>

MemberExpression to MemberExpression[]

The objective is to get an array of MemberExpressions from two LambdaExpressions. The first is convertible to a MethodCallExpression that returns the instance of an object (Expression<Func<T>>). The second Lambda expression would take the result of the compiled first expression and return a nested member (Expression<Func<T,TMember>>). We can assume that the second Lambda expression will only make calls to nested properties, but may do several of these calls.
So, the signature of the method I am trying to create is :
MemberExpression[] GetMemberExpressionArray<T,TValue>(Expression<Func<T>> instanceExpression, Expression<Func<T,TValue>> nestedMemberExpression)
where nestedMemberExpression will be assumed to take an argument of the form
parent => parent.ChildProperty.GrandChildProperty
and the resulting array represents the MemberAccess from parent to ChildProperty and from the value of ChildProperty to GrandChildProperty.
I have already returned the last MemberExpression using the following extension method.
public static MemberExpression GetMemberExpression<T, TValue>(Expression<Func<T, TValue>> expression)
{
if (expression == null)
{
return null;
}
if (expression.Body is MemberExpression)
{
return (MemberExpression)expression.Body;
}
if (expression.Body is UnaryExpression)
{
var operand = ((UnaryExpression)expression.Body).Operand;
if (operand is MemberExpression)
{
return (MemberExpression)operand;
}
if (operand is MethodCallExpression)
{
return ((MethodCallExpression)operand).Object as MemberExpression;
}
}
return null;
}
Now, I know there are several ways to accomplish this. The most immediately intuitive to me would be to loop through the .Expression property to get the first expression and capture references to each MemberExpression along the way. This may be the best way to do it, but it may not. I am not extraordinarily familiar with the performance costs I get from using expressions like this. I know a MemberExpression has a MemberInfo and that reflection is supposed to hurt performance.
I've tried to search for information on expressions, but my resources have been very limited in what I've found.
I would appreciate any advice on how to accomplish this task (and this type of task, in general) with optimal performance and reliability.
I'm not sure why this has been tagged performance, but the easiest way I can think of to extract member-expressions from a tree is to subclass ExpressionVisitor. This should be much simpler than manually writing the logic to 'expand' different types of expressions and walk the tree.
You'll probably have to override the VisitMember method so that:
Each member-expression is captured.
Its children are visited.
I imagine that would look something like:
protected override Expression VisitMember(MemberExpression node)
{
_myListOfMemberExpressions.Add(node);
return base.VisitMember(node);
}
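Fleshed out, such a visitor might look like the sketch below. The class name MemberCollector and its Collect helper are hypothetical. Because VisitMember reaches the outermost member access first, the list is reversed so that the result runs from the parent's property down to the leaf.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical helper: gathers every MemberExpression in a tree.
public sealed class MemberCollector : ExpressionVisitor
{
    private readonly List<MemberExpression> _members = new List<MemberExpression>();

    public static MemberExpression[] Collect(Expression expression)
    {
        var collector = new MemberCollector();
        collector.Visit(expression);
        // The visitor sees parent.Child.GrandChild as GrandChild first,
        // then Child; reverse to get parent-to-leaf order.
        collector._members.Reverse();
        return collector._members.ToArray();
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        _members.Add(node);              // capture this member access
        return base.VisitMember(node);   // then visit its children
    }
}
```

For a lambda such as parent => parent.ChildProperty.GrandChildProperty, Collect would return the ChildProperty access followed by the GrandChildProperty access. The ordering guarantee only holds for a single chain of property accesses, which matches the assumption stated in the question.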
I'm slightly unclear about the remainder of your task; it appears like you want to rewrite parameter-expressions, in which case you might want to look at this answer from Marc Gravell.

How do I combine LINQ expressions into one?

I've got a form with multiple fields on it (company name, postcode, etc.) which allows a user to search for companies in a database. If the user enters values in more than one field then I need to search on all of those fields. I am using LINQ to query the database.
So far I have managed to write a function which will look at their input and turn it into a List of expressions. I now want to turn that List into a single expression which I can then execute via the LINQ provider.
My initial attempt was as follows
private Expression<Func<Company, bool>> Combine(IList<Expression<Func<Company, bool>>> expressions)
{
if (expressions.Count == 0)
{
return null;
}
if (expressions.Count == 1)
{
return expressions[0];
}
Expression<Func<Company, bool>> combined = expressions[0];
expressions.Skip(1).ToList().ForEach(expr => combined = Expression.And(combined, expr));
return combined;
}
However this fails with an exception message along the lines of "The binary operator And is not defined for...". Does anyone have any ideas what I need to do to combine these expressions?
EDIT: Corrected the line where I had forgotten to assign the result of and'ing the expressions together to a variable. Thanks for pointing that out folks.
You can use Enumerable.Aggregate combined with Expression.AndAlso. Here's a generic version:
Expression<Func<T, bool>> AndAll<T>(
IEnumerable<Expression<Func<T, bool>>> expressions) {
if(expressions == null) {
throw new ArgumentNullException("expressions");
}
if(!expressions.Any()) {
return t => true;
}
// AndAlso cannot be applied to the lambdas themselves, so each
// predicate is invoked against one shared parameter first.
ParameterExpression parameter = Expression.Parameter(typeof(T), "t");
Expression combined = expressions
.Select(e => (Expression)Expression.Invoke(e, parameter))
.Aggregate((e1, e2) => Expression.AndAlso(e1, e2));
return Expression.Lambda<Func<T, bool>>(combined, parameter);
}
Your current code is never assigning to combined:
expr => Expression.And(combined, expr);
returns a new Expression that is the result of bitwise anding combined and expr but it does not mutate combined.
EDIT: Jason's answer is now fuller than mine was in terms of the expression tree stuff, so I've removed that bit. However, I wanted to leave this:
I assume you're using these for a Where clause... why not just call Where with each expression in turn? That should have the same effect:
var query = ...;
foreach (var condition in conditions)
{
query = query.Where(condition);
}
Here we have a general question about combining Linq expressions. I have a general solution for this problem. I will provide an answer regarding the specific problem posted, although it's definitely not the way to go in such cases. But when simple solutions fail in your case, you may try to use this approach.
First you need a library consisting of 2 simple functions. They use System.Linq.Expressions.ExpressionVisitor to dynamically modify expressions. The key feature is unifying parameters inside the expression, so that 2 parameters with the same name are made identical (UnifyParametersByName). The remaining part is replacing a named parameter with a given expression (ReplacePar). The library is available under the MIT license on GitHub: LinqExprHelper, but you may quickly write something on your own.
The library allows for quite simple syntax for combining complex expressions. You can mix inline lambda expressions, which are nice to read, together with dynamic expression creation and composition, which is very capable.
private static Expression<Func<Company, bool>> Combine(IList<Expression<Func<Company, bool>>> expressions)
{
if (expressions.Count == 0)
{
return null;
}
// Prepare a master expression, used to combine other
// expressions. It needs more input parameters, they will
// be reduced later.
// There is a small inconvenience here: you have to use
// the same name "c" for the parameter in your input
// expressions. But it may be all done in a smarter way.
Expression <Func<Company, bool, bool, bool>> combiningExpr =
(c, expr1, expr2) => expr1 && expr2;
LambdaExpression combined = expressions[0];
foreach (var expr in expressions.Skip(1))
{
// ReplacePar comes from the library, it's an extension
// requiring `using LinqExprHelper`.
combined = combiningExpr
.ReplacePar("expr1", combined.Body)
.ReplacePar("expr2", expr.Body);
}
return (Expression<Func<Company, bool>>)combined;
}
Assume you have two expressions e1 and e2; you can try this:
var combineBody = Expression.AndAlso(e1.Body, Expression.Invoke(e2, e1.Parameters[0]));
var finalExpression = Expression.Lambda<Func<TestClass, bool>>(combineBody, e1.Parameters).Compile();
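An alternative to Expression.Invoke, which some providers (classic Entity Framework, for one) cannot translate, is to rewrite every predicate body so that they all use the same parameter. The following is only a sketch under that assumption; PredicateCombiner and ParameterReplacer are hypothetical names, not part of any library mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public static class PredicateCombiner
{
    // Swaps one parameter for another throughout an expression body.
    private sealed class ParameterReplacer : ExpressionVisitor
    {
        private readonly ParameterExpression _from;
        private readonly ParameterExpression _to;

        public ParameterReplacer(ParameterExpression from, ParameterExpression to)
        {
            _from = from;
            _to = to;
        }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            return node == _from ? _to : base.VisitParameter(node);
        }
    }

    public static Expression<Func<T, bool>> AndAll<T>(
        IEnumerable<Expression<Func<T, bool>>> predicates)
    {
        ParameterExpression parameter = Expression.Parameter(typeof(T), "x");
        Expression body = null;
        foreach (var predicate in predicates)
        {
            // Rebind this predicate's own parameter to the shared one.
            Expression rebound =
                new ParameterReplacer(predicate.Parameters[0], parameter)
                    .Visit(predicate.Body);
            body = body == null ? rebound : Expression.AndAlso(body, rebound);
        }
        // An empty list yields a predicate that accepts everything.
        return Expression.Lambda<Func<T, bool>>(
            body ?? Expression.Constant(true), parameter);
    }
}
```

The result is a single lambda with no nested invocations, so it can be passed straight to Where on an IQueryable source.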

Combining multiple expressions (Expression<Func<T,bool>>) not working with variables. Why?

I've written a number of methods (.WhereOr, .WhereAnd) which basically allow me to "stack up" a bunch of lambda queries, and then apply them to a collection. For example, the usage with datasets would be a little like this (although it works with any class by using generics):
WITH LINQ TO DATASETS (Using the .NET DataSetExtensions)
DataTable Result;
List<Expression<Func<DataRow, bool>>> Queries = new List<Expression<Func<DataRow, bool>>>();
Queries.Add(dr=> dr.Field<string>("field1") == "somestring");
Queries.Add(dr=> dr.Field<string>("field2") == "somestring");
Queries.Add(dr=> dr.Field<string>("field3") == "somestring");
Result = GetSomeTable().AsEnumerable().WhereOr(Queries).CopyToDataTable();
Now say that in the example above only one row in the collection matches "somestring", and it's on field "field2".
That means that the count for Result should be 1.
Now, say I re-write the code above slightly to this:
DataTable Result;
List<Expression<Func<DataRow, bool>>> Queries = new List<Expression<Func<DataRow, bool>>>();
List<string> columns = new string[]{"field1","field2","field3"}.ToList();
string col;
foreach(string c in columns){
col = c;
Queries.Add(dr=> dr.Field<string>(col) == "somestring");
}
Result = GetSomeTable().AsEnumerable().WhereOr(Queries).CopyToDataTable();
Now, I don't really understand expressions, but to me both examples above are doing exactly the same thing.
Except that "Result" in the first example has a count of 1, and "Result" in the second example has a count of 0.
Also, in the List columns in the second example, if you put "field2" last, instead of second, then "Result" does correctly have a count of 1.
So, from all this I've come to a kind of conclusion, but I don't really understand what's happening, nor how to fix it..? Can I "evaluate" those expressions earlier...or part of them?
CONCLUSION:
Basically, it seems like, if I send literal values into there, like "field1", it works. But if I send in variables, like "col", it doesn't work, because those "expressions" are only getting evaluated much later in the code.
That would also explain why it works when I move "field2" to the last position: the variable "col" was assigned "field2" last, so by the time the expressions are evaluated, "col" equals "field2".
Ok, so, is there any way around this??
Here's the code for my WhereOr method (it's an extension method to IEnumerable):
public static IQueryable<T> WhereOr<T>(this IEnumerable<T> Source, List<Expression<Func<T, bool>>> Predicates) {
Expression<Func<T, bool>> FinalQuery;
FinalQuery = e => false;
foreach (Expression<Func<T, bool>> Predicate in Predicates) {
FinalQuery = FinalQuery.Or(Predicate);
}
return Source.AsQueryable<T>().Where(FinalQuery);
}
public static Expression<Func<T, bool>> Or<T>(this Expression<Func<T, bool>> Source, Expression<Func<T, bool>> Predicate) {
InvocationExpression invokedExpression = Expression.Invoke(Predicate, Source.Parameters.Cast<Expression>());
return Expression.Lambda<Func<T, bool>>(Expression.Or(Source.Body, invokedExpression), Source.Parameters);
}
The "how to fix" answer. Change this:
string col;
foreach(string c in columns) {
col = c;
Queries.Add(dr=> dr.Field<string>(col) == "somestring");
}
to this:
foreach(string c in columns) {
string col = c;
Queries.Add(dr=> dr.Field<string>(col) == "somestring");
}
Enjoy. The "what & why" answer was given by Brian.
I don't even bother reading the question bodies any more, I just read the title and then say
See
http://lorgonblog.spaces.live.com/blog/cns!701679AD17B6D310!689.entry
and
http://blogs.msdn.com/ericlippert/archive/2009/11/12/closing-over-the-loop-variable-considered-harmful.aspx
Oh geez, after I commented I saw the issue. You are using the same variable, "col", in every iteration of the loop. When you build the lambda expression, it doesn't bind to the value of the variable; it references the variable itself. By the time you execute the query, "col" is set to whatever its last value is. Try creating a temporary string inside the loop, setting its value, and using that.
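The effect is easy to reproduce without any database or expression trees. The sketch below uses plain Func<string> delegates and a hypothetical ClosureDemo class; the capture rules are the same for lambdas converted to Expression<>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureDemo
{
    // One variable shared by every lambda: when the lambdas finally
    // run, all of them see the value assigned in the last iteration.
    public static List<string> SharedVariable(string[] columns)
    {
        var getters = new List<Func<string>>();
        string col = null;
        foreach (string c in columns)
        {
            col = c;
            getters.Add(() => col);
        }
        return getters.Select(g => g()).ToList();
    }

    // A fresh variable per iteration: each lambda keeps its own value.
    public static List<string> PerIterationVariable(string[] columns)
    {
        var getters = new List<Func<string>>();
        foreach (string c in columns)
        {
            string col = c;
            getters.Add(() => col);
        }
        return getters.Select(g => g()).ToList();
    }
}
```

Called with "field1", "field2", "field3", the first method returns "field3" three times, while the second returns the three names in order.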
I found a great article that covers your requirements.
It provides And<T> and Or<T> extension methods for combining predicates.
http://blogs.msdn.com/b/meek/archive/2008/05/02/linq-to-entities-combining-predicates.aspx

Resources