Is it possible to write join with a boolean evaluation? - linq

I want to write
CompareInfo myCompIntl = CompareInfo.GetCompareInfo( "es-ES" );
var SharedYomi = from ObjA in ClassListA
join ObjB in ClassListB
where CompareInfo.Compare(ObjA.Name, ObjB.Name) == 0
select new {stringA = stringA, string = string};
Linq forces me to write join with equals. I can not pass in a Boolean evaluation.
How can I do that?

You cannot write that using the LINQ lambda query syntax. The join keyword requires you to specify exactly two properties that are compared using the equals keyword, because this maps to the first overload of Join that uses the default comparer to compare keys.
However, there is an overload of Join that accepts an IEqualityComparer that will probably work for you, you just need to use method query syntax.
Since you sound like you're not familiar with the method syntax, here's a good starting article from MSDN:
http://msdn.microsoft.com/en-us/library/vstudio/bb397947.aspx
But, basically, the syntax you think of as "LINQ" is just one way to refer to the LINQ extensions, and is really just syntactic sugar around the IEnumerable extension methods that implement LINQ. So, for example, the query:
from x in y where x.IsActive orderby x.Name select x
it basically identical to
y.Where(x => x.IsActive).OrderBy(x => x.Name).Select(x => x);
For the most part, each query clause maps to a particular overload of a particular IEnumerable method, but those methods have a number of other overloads that take different numbers and types of parameters.
The Join methods are a bit complex, because they take two sequences as input and let you combine individual elements of them using expressions, but the idea is exactly the same. A typical join would look like this:
from x in y
join a in b on x.Id equals a.ParentId
select new { x.Id, x.Name, a.Date }
becomes
y.Join(
b,
x => x.Id,
a => a.ParentId,
(x, a) => new { x.Id, x.Name, a.Date });
This will join a.ParentId and x.Id using the default comparison for their data type (int, string, whatever). The compiler directly translates the query syntax into method syntax, so the two behave exactly the same. (Nitpicking my own answer: Technically, the methods are on the Enumerable class, so you are really calling Enumerable.Join. But as they were implemented as extension methods, you can call them either way and the compiler will figure it out.)
In your case, what you need is to pass in a different comparison method, so you can call string.Compare with the explicit encoding. The other overload of Join lets you supply an implementation of IEqualityComparer<T> to use instead of the default. This will require you to implement IEqualityComparer<string> in a separate class, since there's no easy way to create an anonymous interface implementation (perhaps the only feature I miss from Java). For your example, you want something like this:
public class ComparerWithEncoding : IEqualityComparer<string>
{
private CompareInfo compareInfo
public ComparerWithEncoding ( string encoding )
{
this.compareInfo = CompareInfo.GetCompareInfo(encoding);
}
public bool Equals ( string a, string b )
{
return CompareInfo.Compare(a, b) == 0
}
public int GetHashCode(string a)
{
return a.GetHashCode();
}
}
classListA.Join(
ClassListB,
ObjA => ObjA.Name,
ObjB => ObjB.Name,
(ObjA, ObjB) => new { stringA = ObjA.Foo, stringB = ObjB.Bar },
new ComparerWithEncoding("es-ES"));

Related

How can I use func in a C# EF dbquery statement for sorting?

I have to do multi-part sorts and want to do it dynamically.
I found this question but do not know how to use func in a dbquery statement.
No generic method 'ThenBy' on type 'System.Linq.Queryable'
If I could get the code in the thread to work it would be nirvana.
All the examples I have seen use then within a where statement, but I need to use the function to do sorting.
I have written extensions using IQueryable, including ones for orderby and orderbydescending. The problem is thenby and thenbydescending use iorderedqueryable.
The error I get when using ThenByProperty is
Object of type 'System.Data.Entity.Infrastructure.DbQuery1[ORMModel.v_Brand]' cannot be converted to type 'System.Linq.IOrderedEnumerable1[ORMModel.v_Brand]'.
Do not get such an error when I use a comparable OrderByProperty extension.
what a mess, obviously I do not post often here. Anyway I am stumped and clueless so any tips are very appreciated.
Tried to post code but kept getting format errors so gave up. But help me anyways :)
If you use method syntax, you'll see func quite often, for instance in Where, GroupBy, Join, etc
Every method with some input parameters and one return value can be translated to a Func<...> as follows
MyReturnType DoSomething(ParameterType1 p1, ParameterType2, p2) {...}
Func<ParameterType1, ParameterType2, MyReturnType> myFunc = (x, y) => DoSomething(x, y);
The part Func<ParameterType1, ParameterType2, MyReturnType> means: a function with two input parameters and one return value. The input parameters are of type ParameterType1 and ParameterType2, in this order. The return value is of MyReturnType.
You instantiate an object of Func<ParameterType1, ParameterType2, MyReturnType> using a lambda expression. Before the => you type a declaration for the input parameters, after the => you call the function with these input parameters. If you have more than one input parameter you make them comma separated surrounded by brackets.
For a Where you need a Func<TSource, bool>. So a function that has as input one source element, and as result a bool:
Where(x => x.Name == "John Doe")
For a GroupJoin you need a resultSelector of type Func<TOuter,System.Collections.Generic.IEnumerable<TInner>,TResult> resultSelector
So this is a function with as input one element of the outer sequence, and a sequence of elements of the inner sequence. For example, to query Teachers with their Students:
var result = Teachers.GroupJoin(Students,
teacher => teacher.Id, // from every Teacher take the Id,
student => student.TeacherId, // from every Student take the TeacherId,
(teacher, students) => new
{
Id = teacher.Id,
Name = teacher.Name,
Students = students.Select(student => new
{
Id = student.Id,
Name = student.Name,
})
.ToList(),
});
Here you see several Funcs. TOuter is Teacher, TInner is Student, TKey is int
OuterKeySelector: Func<TOuter, TKey>: teacher => teacher.Id
InnerKeySelector: Func<TInner, TKey>: student => student.TeacherId
ResultSelector: Func<Touter, IEnumerable<TInner>, TResult>
The resultSelector is a function that takes one TOuter (a Teacher), and a sequence of TInner (all students of this Teacher) and creates one object using the input parameters
(teacher, students) => new {... use teacher and students }
When creating the lambda expression it is often helpful if you use plurals to refer to collections (teachers, students) and singulars if you refer one element of a collection (student).
Use the => to start defining the func. You can use input parameters that were defined before the => to define the result after the =>

Why is the "Select" of a Method Syntax is in another parenthesis?

var sample = db.Database.OrderByDescending(x => x.RecordId).Select(y => y.RecordId).FirstOrDefault();
I don't know if my title is correct / right. Just want to ask why this query the select is in another ( )?. As for the example .Select(y => y.RecordId) unlike the query I use to be
var sample = (from s in db.Databse where s.RecordId == id select s) I know this is the same right?. Then what is the why it is in another parenthesis?. Anyone has an idea or can anyone explain it why?. Thanks a lot.
In your first example, you're using "regular" C# syntax to call a bunch of extension methods:
var sample = db.Database
.OrderByDescending(x => x.RecordId)
.Select(y => y.RecordId)
.FirstOrDefault();
(They happen to be extension methods here, but of course they don't have to be...)
You use lambda expressions to express how you want the ordering and projection to be performed, and the compiler converts those into expression trees (assuming this is EF or similar; it would be delegates for LINQ to Objects).
The second example is a query expression, although it doesn't actually match your first example. A query expression corresponding to your original query would be:
var sample = (from x in db.Database
orderby x.RecordId descending
select x.RecordId)
.FirstOrDefault();
Query expressions are very much syntactic sugar. The compiler effectively converts them into the first form, then compiles that. The range variable declared in the from clause (x in this case) is used as the parameter name for the lambda expression, so select x.RecordId becomes .Select(x => x.RecordId).
Things become a bit more complicated with joins and multiple from clauses, as then the compiler introduces transparent identifiers to allow you to work with all the range variables that are in scope, even though you've really only got a single parameter. For example, if you had:
var query = from person in people
from job in person.Jobs
order by person.Name
select new { Person = person, Job = job };
that would be translated into the equivalent of
var query = people.SelectMany(person => person.Jobs, (person, job) => new { person, job } )
.OrderBy(t => t.person.Name)
.Select(t => new { Person = t.person, Job = t.job });
Note how the compiler introduces an anonymous type to combine the person and job range variables into a single object, which is used later on.
Basically, query expression syntax makes LINQ easier to work with - but it's just a translation into other C# code, and is neatly wrapped up in a single section of the C# specification. (Section 7.16.2 of the C# 5 spec.)
See my Edulinq blog post on query expressions for more detail on the precise translation from query expressions to "regular" C#.

Is Select optional in a LINQ statement?

I was looking over some LINQ examples, and was thereby reminded they are supposed to have a "select" clause at the end.
But I have a LINQ that's working and has no "Select":
public IEnumerable<InventoryItem> Get(string ID, int packSize, int CountToFetch)
{
return inventoryItems
.Where(i => (i.Id.CompareTo(ID) == 0 && i.PackSize > packSize) || i.Id.CompareTo(ID) > 0)
.OrderBy(i => i.Id)
.ThenBy(i => i.PackSize)
.Take(CountToFetch)
.ToList();
}
Is this because:
(a) select is not really necessary?
(b) Take() is doing the "select"
(c) ToList() is doing the "select"
Truth be told, this was working before I added the "ToList()" also... so it seems LINQ is quite permissive/lax in what it allows one to get away with.
Also, in the LINQ I'm using, I think the OrderBy and ThenBy are redundant, because the SQL query used to populate inventoryItems already has an ORDER BY ID, PackSize clause. Am I right (that the .OrderBy() and .ThenBy() are unnecessary)?
Linq statements do in fact need a select clause (or other clauses, such as a group by). However, you're not using Linq syntax, you're using the Linq Enumerable extension methods, which all (for the most part) return IEnumerable<T>. Therefore, they do not need the Select operator.
var result = from item in source
where item.Value > 5
select item;
Is exactly the same as
var result = source.Where(item => item.Value > 5);
And for completeness:
var result = from item in source
where item.Value > 5
select item.Value;
Is exactly the same as
var result = source.Where(item => item.Value > 5)
.Select(item => item.Value);
Linq statements (Linq syntax statements) need a special clause at the end to signify what the result of the query should be. Without a select, group by, or other selection clause, the syntax is incomplete, and the compiler does not know how to change the expression into the appropriate extension methods (which is what Linq syntax actually gets compiled to).
As far as ToList goes, that's one of the Enumerable extension methods that does not return an IEnumerable<t>, but instead a List<T>. When you use ToList or ToArray the Enumerable is enumerated immediately and converted to a list or array. This is useful if your query is complex and you need to enumerate the results multiple times without running the query multiple times).
You only use select when you want to project your object into a different type..
if you had a list that contains an object with an ID property that was an int
var newList = items.Select(i => i.ID);
newList would be an IEnumerable<int>
NB.
A common mistake is to mix up a Select with a Where.
items.Where(i => i.ID == 1); returns an IEnumerable<item>
items.Select(i => i.ID == 1); returns an IEnumerable<bool>
as the Select projects each item into the result of the function passed in..

Linq in conjunction with custom generic extension method build error - An expression tree may not contain an assignment operator?

I've made a generic extension method that executes an action on an object and returns the object after that:
public static T Apply<T>(this T subject, Action<T> action)
{
action(subject);
return subject;
}
I'm unable to use this extension method in an EntityFramework Linq query because of:
An expression tree may not contain an assignment operator
Why is this?
The Linq query:
var parents = from p in context.Parent
join phr in context.Child on p.key equals phr.parentkey
into pr
select p.Apply(
x => x.Children = //The assignment operator that fails to build...
pr.ToDictionary(y => y.childkey, y => y.childname));
Well, assignment operator aside, how would you expect your Apply method to be translated into SQL? Entity Framework doesn't know anything about it, and can't delve into opaque delegates, either.
I suspect what you really need to do is separate out the bits to do in the database from the bits to do locally:
var dbQuery = from p in context.Parent
join phr in context.Child on p.key equals phr.parentkey into pr
select new { p, phr };
var localQuery = dbQuery.AsEnumerable()
.Select(pair => /* whatever */);

Var vs IEnumerable when it comes to Entity Framework

If I were to use IEnumerable instead of var in the code example below, will the SQL be generated only during the execution of the foreach statement? Or will it execute as an when the Linq statements are evaluated?
var query = db.Customers.Where (c => c.Age > 18);
query = query.Where (c => c.State == "CO");
var result = query.Select (c => c.Name);
foreach (string name in result) // Only now is the query executed!
Console.WriteLine (name);
Another example:
IEnumerable<Order> query = db.Orders.Where(o => o.Amount > 1000);
int orderCount = query.Count();
Would it be better to use var (or IQueryable) as it would be executed a select count(*)... when .Count() is executed or would it be exactly same with the IEnumerable code shown above?
It would make no difference. var is just syntactic sugar. If you hover over var, you will see what type C# thinks your query is.
From http://msdn.microsoft.com/en-us/library/bb383973.aspx
Beginning in Visual C# 3.0, variables that are declared at method scope can have an implicit type var. An implicitly typed local variable is strongly typed just as if you had declared the type yourself, but the compiler determines the type. The following two declarations of i are functionally equivalent:
var i = 10; // implicitly typed
int i = 10; //explicitly typed
If you want to perform actions on your query that SQL wouldn't know what to do with, such as a method defined in your class, then you could use AsEnumerable().
For example:
var query = db.Customers
.Where(c => c.Age > 18)
.AsEnumerable()
.Select(c => GetCustomerViewModel());
//put definition of GetCustomerViewModel() somewhere in the same class
Update
As Omar mentioned, if you want your query to be executed immediately (rather than deferred), use ToList(). This will immediately enumerate the IQueryable/IEnumerable and fill the list with the actual data from the database.
In general, the SQL is generated when GetEnumerator is called on the IQueryable.
In your example, there is a subtle difference that you may want to consider. Let's take your original example:
var query = db.Customers.Where (c => c.Age > 18);
query = query.Where (c => c.State == "CO");
var result = query.Select (c => c.Name);
In this case if you change your first query to IEnumerable query = ..., then the second line would use the IEnumerable version of the Where extension (LINQ to Objects) rather than the IQueryable one (LINQ to SQL/EF). As a result, when we start iterating, the first where clause is passed to the database, but the second where clause is performed on the client side (because it has already been converted to an IEnumerable).
Another subtle item you want to be aware of is the following type of code:
var query = db.OrderBy(c => c.State);
query = query.Customers.Where(c => c.Age > 18); // Fails: Widening
In this case, since your original query returns IOrderedQueryable rather than IQueryable. If you try to then assign query to the result of the .Where operation, you're trying to widen the scope of the return type and the compiler will refuse to perform that widening. As a result, you have to explicitly specify the baseline type rather than using var:
IQueryable<Customer> query = db.OrderBy(c => c.State); // Is narrowing rather than widening.
query = query.Customers.Where(c => c.Age > 18);
Linq queries return IQueryable<> or IEnumerable<>, the execution of both is deferred.
As DanM stated, whether or not you use var or IEnumerable<> it all depends on the return value of the method you're calling: if it's an IEnumerable<> or IQuerable<> it'll be deferred, if you use .ToList(), it'll be executed right away.
When to use var comes down to personal choice/style. I generally use var when the return value is understood from the line of code and variable name or if I'm instantiating a generic with a long declartion, e.g. Dictionary<string, Func<Order, object>>.
From your code, it's clear that a collection of Customers/Orders is returned, so I would use the var keyword. Again, this is a matter of personal preference.

Resources