How Does Queryable.OfType Work? - linq

Important The question is not "What does Queryable.OfType do, it's "how does the code I see there accomplish that?"
Reflecting on Queryable.OfType, I see (after some cleanup):
public static IQueryable<TResult> OfType<TResult>(this IQueryable source)
{
return (IQueryable<TResult>)source.Provider.CreateQuery(
Expression.Call(
null,
((MethodInfo)MethodBase.GetCurrentMethod()).MakeGenericMethod(
new Type[] { typeof(TResult) }) ,
new Expression[] { source.Expression }));
}
So let me see if I've got this straight:
Use reflection to grab a reference to the current method (OfType).
Make a new method, which is exactly the same, by using MakeGenericMethod to change the type parameter of the current method to, er, exactly the same thing.
The argument to that new method will be not source but source.Expression. Which isn't an IQueryable, but we'll be passing the whole thing to Expression.Call, so that's OK.
Call Expression.Call, passing null as method (weird?) instance and the cloned method as its arguments.
Pass that result to CreateQuery and cast the result, which seems like the sanest part of the whole thing.
Now the effect of this method is to return an expression which tells the provider to omit returning any values where the type is not equal to TResult or one of its subtypes. But I can't see how the steps above actually accomplish this. It seems to be creating an expression representing a method which returns IQueryable<TResult>, and making the body of that method simply the entire source expression, without ever looking at the type. Is it simply expected that an IQueryable provider will just silently not return any records not of the selected type?
So are the steps above incorrect in some way, or am I just not seeing how they result in the behavior observed at runtime?

It's not passing in null as the method - it's passing it in as the "target expression", i.e. what it's calling the method on. This is null because OfType is a static method, so it doesn't need a target.
The point of calling MakeGenericMethod is that GetCurrentMethod() returns the open version, i.e. OfType<> instead of OfType<YourType>.
Queryable.OfType itself isn't meant to contain any of the logic for omitting returning any values. That's up to the LINQ provider. The point of Queryable.OfType is to build up the expression tree to include the call to OfType, so that when the LINQ provider eventually has to convert it into its native format (e.g. SQL) it knows that OfType was called.
This is how Queryable works in general - basically it lets the provider see the whole query expression as an expression tree. That's all it's meant to do - when the provider is asked to translate this into real code, that's where the magic happens.
Queryable couldn't possibly do the work itself - it has no idea what sort of data store the provider represents. How could it come up with the semantics of OfType without knowing whether the data store was SQL, LDAP or something else? I agree it takes a while to get your head round though :)

Related

Understanding Html.ActionLink(..., ..., ...) syntax in MVC3

I'm doing a tutorial on MVC 3 and I stumbled upon the helper #Html.ActionLink(genre.Name, "Browse", new {genre = genre.Name}).
Now I understand what the values do and that the third value is a route parameter value but this is the first time I'm seeing this kind of syntax which is really bugging me for some reason.
What I mean exactly is new {genre = genre.Name}. I've come to understand that "new" precedes object/type declaration, however, this time it's simply the "new" keyword and the curly brackets. How exactly is this processed?
The syntax new { prop = val } creates an anonymous type. It's essentially the same as creating an instance of a class, except you're declaring the class and the instance all in one shot. Some people think that anonymous types are not statically typed or are not type safe. This isn't true. The types of the properties are inferred from the values they are assigned. This construction is used frequently in MVC and in linq.
Note that this syntax is not specific to MVC. You can use it anywhere it's convenient. I make a fair amount of use of anonymous types in day-to-day coding.
It's simple.. the first parameter is the link you want to display, so genre.Name can corresponds to Rock. The second argument is the action, the third argument your Controller class. The last parameter is the route values in the form of an anonymous object (an object you will never use again, the MVC engine uses the anonymous object in this case).
So your action(method) takes a string argument.
For example:
"Home" is the link the user sees (the first argument), Home (the second argument) is the action (method) to your Controller class, and it takes a string argument.
class HomeController
{
public ActionResult GenreAction(string genre)
{
}
}
When a request is made, it becomes Home/GenreAction/genre
It's a C# language feature called Anonymous Type, introduced with C# 3.5 if I'm not mistaken.

Linq To Sql Where does not call overridden Equals

I'm currently working on a project where I'm going to do a lot of comparison of similar non-database (service layer objects for this discuss) objects and objects retrieved from the database via LinqToSql. For the sake of this discussion, assume I have a service layer Product object with a string field that is represented in the database. However, in the database, there is also a primary key Id that is not represented in the service layer.
Accordingly (as I often do for unit testing etc), I overrode Equals(Object), Equals(Product), and GetHashCode and implemented IEquatable with the expectation that I would be able to write code like this:
myContext.Products.Where(p => p.Equals(passedInProduct).SingleOrDefault();
And so forth.
The Equals override is tested and works. The objects are mutable so the usual caveats apply to the GetHashCode override. However, for the purposes of this example, the objects are not modified except by LtS and could be made readonly.
Here's the simple test:
Create a test object in memory and commit to the LtS context. By committing, the test object is populated with a few auto-generated fields.
Create another identical test object in memory (separate reference)
Attempt to retrieve the first object from the database using the second object as the criteria. (see code line above).
// Setup
string productDesc = "12A";
Product testProduct1 = _CreateTestProductInDatabase(productDesc);
Product testProduct2 = _CreateTestProduct(productDesc);
// check setup
Product retrievedProduct1 = ProductRepo.Retrieve(testProduct1);
//Assert.IsNotNull(retrievedProduct1);
// execute - try to retrieve the 'equivalent' product object
Product retrievedProduct2 = ProductRepo.Retrieve(testProduct2);
A simplified version of Retrieve (cruft removed is just parameter checks etc):
using (var dbContext = new ProductDataContext()) {
Product retrievedProduct = dbContext.Products
.Where(p => p.Equals(product)).SingleOrDefault();
NB: The overridden Equals method knows not to care about the auto-generated fields from the database and only looks at the string that is represented in the service layer.
Here's what I observed:
Retrieve on testProduct1 succeeds (no surprise, equal by reference)
Retrieve on testProduct2 fails (null)
The overridden Equals method called in the Retrieve method is never hit during either Retrieve calls
However, the overridden Equals method is called multiple times by the context on SubmitChanges (called when creating the first test object in the database) (works as expected).
Statically, the compiler knows that the type of the objects being emitted and is able to resolve the type.
So my specific questions:
Am I trying to do something ill-advised? Seems like a straightforward use of Equals.
Corollary to first question: alternate suggestions to deal with linq to sql equality checking while keeping comparison details inside the objects rather than the repository
Why might I have observed the Equals method being resolved in SubmitChanges but not in the Where clause?
I'm as much interested in understanding as making my Equals calls work. But I also would love to learn how to make this 'pattern' work rather than just understand why it appears to be an 'anti-pattern' in the contest of LtS and C#.
Please don't suggest I just filter directly on the context with Where statements. Obviously, I can remove the Equals call and do that. However, some of the other objects (not presented here) are large and a bit complicated. For the sake of maintenance and clarity, I want to keep knowledge of how to compare itself to another of its own type in one place and ideally as part of the object in question.
Some other things I tried that didn't change the behavior:
Overloaded and used == instead
Casting the lambda variable to the type p => (Product)p
Getting an IQueryable object first and calling Equals in the Where clause
Some other things I tried that didn't work:
Creating a static ProductEquals(Product first, Product second) method: System.NotSupportedException:has no supported translation to SQL.
Thanks StackOverflow contributors!
Re Possible dups: I've read ~10 other questions. I'd love a pointer to an exact duplicate but most don't seem to directly address what seems to be an oddity of LinqToSql.
Am I trying to do something ill-advised?
Absolutely. Consider what LINQ to SQL does: it creates a SQL representation of your query. It doesn't know what your overridden Equals method does, so can't translate that logic into SQL.
Corollary to first question: alternate suggestions to deal with linq to sql equality checking while keeping comparison details inside the objects rather than the repository
You'd need to do something with expression trees to represent the equality that way - and then build those expression trees up into a full query. It won't be fun, but it should be possible. It will affect how you build all your queries though.
I would have expected most database representations to be ID-based though, so you should be able to just compare IDs for equality. Usually when I've seen attempts to really model data in an OO fashion but store it in a database, the leakiness of the abstraction has caused a lot of pain.
Why might I have observed the Equals method being resolved in SubmitChanges but not in the Where clause?
Presumably SubmitChanges is working against a set of in-memory objects to work out what's changed - it doesn't have to do any conversion to SQL to do that part.

Expression Lambda - retriveing Filed

is it possibile executing through Expression class a Select (projection) passing data field as string in order to achieve a strongly typed collection?
That' because I'm working with Linq to Entities and I would be able to making some retrive by grabbing wpf grid column name.
IS it exist something like Expression.Lamba.Select("field1, field2") which return a List..?
You could create a method that would call Select() with an expression that creates a Tuple (or possibly something else) from properties in your entity and let EF handle the rest.
The problem is, the only way you could treat the result of such method in a strongly-typed way would be if you knew the exact type it should return at compile-type, which it seems you don't.
The best you can do is to treat the result as a non-generic IEnumerable or alternatively try to use dynamic.

Workarounds for using custom methods/extension methods in LINQ to Entities

I have defined a GenericRepository class which does the db interaction.
protected GenericRepository rep = new GenericRepository();
And in my BLL classes, I can query the db like:
public List<Album> GetVisibleAlbums(int accessLevel)
{
return rep.Find<Album>(a => a.AccessLevel.BinaryAnd(accessLevel)).ToList();
}
BinaryAnd is an extension method which checks two int values bit by bit. e.g. AccessLevel=5 => AccessLevel.BinaryAnd(5) and AccessLevel.binaryAnd(1) both return true.
However I cannot use this extension method in my LINQ queries. I get a runtime error as follows:
LINQ to Entities does not recognize the method 'Boolean BinaryAnd(System.Object, System.Object)' method, and this method cannot be translated into a store expression.
Also tried changing it to a custom method but no luck. What are the workarounds?
Should I get all the albums and then iterate them through a foreach loop and pick those which match the AccessLevels?
I realize this already has an accepted answer, I just thought I'd post this in case someone wanted to try writing a LINQ expression interceptor.
So... here is what I did to make translatable custom extension methods: Code Sample
I don't believe this to be a finished solution, but it should hopefully provide a good starting point for anyone brave enough to see it through to completion.
You can only use the core extension methods and CLR methods defined for your EF provider when using Entity Framework and queries on IQueryable<T>. This is because the query is translated directly to SQL code and run on the server.
You can stream the entire collection (using .ToEnumerable()) then query this locally, or convert this to a method that is translatable directly to SQL by your provider.
That being said, basic bitwise operations are supported:
The bitwise AND, OR, NOT, and XOR operators are also mapped to canonical functions when the operand is a numeric type.
So, if you rewrite this to not use a method, and just do the bitwise operation on the value directly, it should work as needed. Try something like the following:
public List<Album> GetVisibleAlbums(int accessLevel)
{
return rep.Find<Album>(a => (a.AccessLevel & accessLevel > 0)).ToList();
}
(I'm not sure exactly how your current extension method works - the above would check to see if any of the flags come back true, which seems to match your statement...)
There are ways to change the linq query just before EF translates it to SQL, at that moment you'd have to translate your ''foreign'' method into a construct translatable by EF.
See an previous question of mine How to wrap Entity Framework to intercept the LINQ expression just before execution? and mine EFWrappableFields extension which does just this for wrapped fields.

Using return statements to great effect!

When I am making methods with return values, I usually try and set things up so that there is never a case when the method is called in such a way that it would have to return some default value. When I started I would often write methods that did something, and would either return what they did or, if they failed to do anything, would return null. But I hate having ugly if(!null) statements all over my code,
I'm reading a re-guide to ruby that I read many moons ago, by the pragmatic programmers, and I notice that they often return self (ruby's this) when they wouldn't normally return anything. This is, they say, in order to be able to chain method calls, as in this example using setters that return the object whose attributes they set.
tree.setColor(green).setDecor(gaudy).setPractical(false)
Initially I find this sort of thing attractive. There have been a couple of times when I have rejoiced at being able to chain method calls, like Player.getHand().getSize() but this is somewhat different in that the object of the method call changes from step to step.
What does Stack Overflow think about return values? Are there any patterns or idioms that come to mind warmly when you think of return values? Any great ways to avoid frustration and increase beauty?
In my humble opinion, there are three kinds of return-cases that you should take into consideration:
Object property manipulation
The first is the manipulation of object properties. The pattern you describe here is very often used when manipulating objects. A very typical scenario is using it together with a factory. Consider this hypothetical creation call:
// When the object has manipulative methods:
Pizza p = PizzaFactory().create().addAnchovies().addTomatoes();
// When the factory has manipulative methods working on the
// object, IMHO more elegant from a semantic point of view:
Pizza p = PizzaFactory().create().addAnchovies().addTomatoes().getPizza();
It allows for a quick grasp at what exactly is being created or how an object is manipulated, because the methods form one human-readable expression. It's definitely nice, but don't overuse. A rule of thumb is that this might be used with methods whose return value you could also declare as void.
Evaluating object properties
The second might be when a method evaluates something on an object. Consider, for example, the method car.getCurrentSpeed(), that could be interpreted as a message to an object asking for the current speed and returning that. It would simply return the value, not too complicated. :)
Make object do this or that
The third might be when a method makes an perform an operation, returning some sort of value indicating how well the caller's intention was fulfilled - but laying out such a method could be difficult:
int new_gear = 20;
if (car.gears.changeGear(new_gear)) // does that mean success or fail?
This is where you can see a difficulty in designing the method. Should it return 0 upon success or failure? How about -1 if the gear could not be set, because the car only has 5 gears? Does that mean the current gear is at -1 now, too? The method could return the gear it changed to, meaning you would have to compare the argument supplied to the method to the return code. That would work. On the other hand, you could simply return either true or false for failure or false or true for failure. Which one to use could be decided by estimating if you'd expect those method calls to rather fail or succeed.
In my humble opinion, there is a way to better express the semantics of such return values, by giving them a semantic description. Future developers interacting with your objects will love you for not having to look up the comments or documentation for your methods:
class GearSystem {
// (...)
public:
enum GearChangeResult
{ GearChangeSuccess, NonExistingGear, MechanicalGearProblem };
GearChangeResult changeGear (int gear);
};
That way, it becomes perfectly obvious for any programmer looking at your code, what the return value means (consider: if (gears.changeGear(20) == GearSystem::GearChangeSuccess) - much clearer what that means than the example above)
Antipattern: Failures as return codes.
The fourth possibility for a return value I actually omitted, because in my opinion it isn't any: when there's an error in your program, like a logic error or a failure that needs to be dealt with - you could theoretically return a value indicating so. But today, that's not done so often anymore (or should not be), because for that, there are exceptions.
I don't agree that methods should never return null. The most obvious examples are from systems programming. For instance, if someone asks to open a file, you simply have to give them null if the open fails. There is no sane alternative. There are other cases where null is appropriate, such as a getNextNode(node) method, when called on the last node of a linked list. So I guess what these cases have in common is that null represents "no object" (either no file handle or no list node), which makes sense.
In other cases, the method should never fail, and there is an appropriate exception facility. Then, I think method chaining like your example can be used to great effect. I think it's a bit funny that you seem to believe this is an innovation of the "Pragmatic Programmers". In fact, it dates to Lisp if not before.
Returning this is also used in the "builder pattern", another case where method chaining can enhance readability as well as writing convenience.
A null is often returned as an out-of-band value to indicate that no result could be produced. I believe that this is perfectly reasonable when getting no result is a normal event; examples would include a null return from readLine() at end-of-file, or a null returned when providing a non-existent key to the get(...) method of a Map. Reading to the end of the file is normal behavior (as opposed to an IOException, which indicates that something went abnormally wrong while trying to read). Similarly, looking up a key and being told that it has no value is a normal case.
A good alternative to null for some cases is a "null object", which is a full-fledged instance of the result class, but which has appropriate state and behavior for a "nobody's home" case. For instance, the result of looking up a non-existent user ID might well be a NullUser object which has a zero-length name and no permissions to do anything in the system.
It's confusing to me. OO programming languages need Smalltalk's semicolon:
tree color: green;
decor: gaudy;
practical: false.
obj method1; method2. means "call method1 on obj then method2 on obj". This kind of object setup is very common.

Resources