Strange problem with LINQ to NHibernate and string comparison - linq

I'm using LINQ to NHibernate and have run into a strange problem when comparing strings. The following code works fine, but when I un-comment:
//MyCompareFunc(dl.DamageNumber, damageNumberSearch) &&
and comment out:
dl.DamageNumber.Contains(damageNumberSearch) &&
then it breaks down: MyCompareFunc() seems to always return true, while dl.DamageNumber.Contains(damageNumberSearch) sometimes returns true and sometimes returns false.
In other words, when I use string.Contains() directly in the LINQ query it works, but when I move it into a method, it does not.
internal List<DamageList> SearchDamageList(
    DateTime? sendDateFromSearch, DateTime? sendDateToSearch, string damageNumberSearch,
    string insuranceContractSearch)
{
    var q = from dl in session.Linq<DamageList>()
            where
                CommonHelper.IsDateBetween(dl.SendDate, sendDateFromSearch, sendDateToSearch) &&
                //MyCompareFunc(dl.DamageNumber, damageNumberSearch) &&
                dl.DamageNumber.Contains(damageNumberSearch) &&
                insuranceContractSearch == null ? true : CommonHelper.IsSame(dl.InsuranceContract, insuranceContractSearch)
            select dl;
    return q.ToList<DamageList>();
}
private bool MyCompareFunc(string damageNumber, string damageNumberSearch)
{
    return damageNumber.Contains(damageNumberSearch);
}

I have to admit I'm not an expert in NHibernate, but we have frequently run into the same kind of problem with a different ORM. The thing is that the LINQ engine, while translating the query, can recognize simple string functions from the .NET library, such as Contains, and translate them into their SQL equivalents. The SQL equivalent does a case-insensitive comparison (this depends on the database settings, but it is usually the default).
On the other hand, the engine cannot parse the source code of your custom function, so it can't translate it into SQL. It has to execute the function in memory, after loading the result of the preceding part of the query from the database. That means it runs as .NET code, where the comparison is case-sensitive by default.
That could be the reason for your mismatch of results ;)
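The difference described above can be reproduced without any database. This is only a sketch: the method names are made up for illustration, and the case-insensitive variant merely imitates what a LIKE comparison would do under a case-insensitive collation, which is an assumption about the database configuration.

```csharp
using System;

public static class CaseDemo
{
    // What runs when the comparison happens in memory as .NET code:
    // string.Contains(string) is an ordinal, case-sensitive comparison.
    public static bool ContainsInMemory(string value, string search)
        => value.Contains(search);

    // Rough stand-in for a SQL LIKE '%search%' under a case-insensitive
    // collation (assumption: the database is configured that way).
    public static bool ContainsLikeCaseInsensitiveDb(string value, string search)
        => value.IndexOf(search, StringComparison.OrdinalIgnoreCase) >= 0;

    public static void Main()
    {
        Console.WriteLine(ContainsInMemory("DMG-2009-A", "dmg"));              // False
        Console.WriteLine(ContainsLikeCaseInsensitiveDb("DMG-2009-A", "dmg")); // True
    }
}
```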

LINQ works with expression trees, not with compiled functions. It should be fine if you expose the predicate as an Expression<Func<...>> instead of a "compiled" method.
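A minimal sketch of that idea, assuming a simplified DamageList class and a hypothetical helper named DamageNumberContains. It runs against an in-memory IQueryable so that it is self-contained; whether a particular NHibernate LINQ provider version can translate the resulting tree is not something this example demonstrates.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Simplified stand-in for the entity in the question.
public class DamageList
{
    public string DamageNumber { get; set; }
}

public static class DamageFilters
{
    // Returns an expression tree rather than a compiled delegate, so a LINQ
    // provider can inspect the Contains call inside it instead of seeing an
    // opaque method call.
    public static Expression<Func<DamageList, bool>> DamageNumberContains(string search)
        => dl => dl.DamageNumber.Contains(search);

    public static void Main()
    {
        IQueryable<DamageList> source = new[]
        {
            new DamageList { DamageNumber = "AB-123" },
            new DamageList { DamageNumber = "CD-456" }
        }.AsQueryable();

        // Queryable.Where accepts the expression directly.
        var matches = source.Where(DamageNumberContains("123")).ToList();
        Console.WriteLine(matches.Count); // 1
    }
}
```

The predicate is passed to Where in method syntax here, because a query-expression where clause expects a boolean expression rather than an Expression<Func<...>> value.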

Related

Incorrect Resharper Suggestion - "Merge Conditional Expression"

Given an oracle query which is returning a single string value, to get the value from the query I use the following two lines:
var result = cmd.ExecuteOracleScalar();
return result != null ? ((OracleString)result).Value : null;
The "!= null" in this statement is underlined with the suggestion to "Merge Conditional Expression". If I accept the suggestion, it changes the code to:
return ((OracleString)result).Value;
Which throws an exception because the value returned will be null for a number of executions.
Is there any way to use the ternary operator without getting this warning?
Note that if I change the code to:
var result = cmd.ExecuteOracleScalar();
if (result == null)
    return null;
return ((OracleString)result).Value;
Resharper then first suggests that I "Convert to Return Statement" which just changes it back to use the ternary operator.
Any Suggestions?
This looks like exactly the bug identified in RSRP-434610:
given original code that checks an object reference for nullity, and accesses a property of the object if the object reference is not null
R# proposes a refactoring that always accesses the property, and will therefore fail when the object reference is null
The issue has a fix version of 9.1, which was released just a few days ago, although watch out trying to upgrade from within VS.

Why does using Linq's .Select() return IEnumerable<dynamic> even though the type is clearly defined?

I'm using Dapper to return dynamic objects and sometimes mapping them manually. Everything's working fine, but I was wondering what the laws of casting were and why the following examples hold true.
(for these examples I used 'StringBuilder' as my known type, though it is usually something like 'Product')
Example1: Why does this return an IEnumerable<dynamic> even though 'makeStringBuilder' clearly returns a StringBuilder object?
Example2: Why does this build, but 'Example1' wouldn't if it was IEnumerable<StringBuilder>?
Example3: Same question as Example2?
private void test()
{
    List<dynamic> dynamicObjects = {Some list of dynamic objects};
    IEnumerable<dynamic> example1 = dynamicObjects.Select(s => makeStringBuilder(s));
    IEnumerable<StringBuilder> example2 = dynamicObjects.Select(s => (StringBuilder)makeStringBuilder(s));
    IEnumerable<StringBuilder> example3 = dynamicObjects.Select(s => makeStringBuilder(s)).Cast<StringBuilder>();
}
private StringBuilder makeStringBuilder(dynamic s)
{
    return new StringBuilder(s);
}
With the above examples, is there a recommended way of handling this? And does casting like this hurt performance? Thanks!
When you use dynamic, even as a parameter, the entire expression is handled via dynamic binding and will result in being "dynamic" at compile time (since it's based on its run-time type). This is covered in 7.2.2 of the C# spec:
However, if an expression is a dynamic expression (i.e. has the type dynamic) this indicates that any binding that it participates in should be based on its run-time type (i.e. the actual type of the object it denotes at run-time) rather than the type it has at compile-time. The binding of such an operation is therefore deferred until the time where the operation is to be executed during the running of the program. This is referred to as dynamic binding.
In your case, using the cast will safely convert this to an IEnumerable<StringBuilder>, and should have very little impact on performance. The example2 version is very slightly more efficient than the example3 version, but both have very little overhead when used in this way.
While I can't speak very well to the "why", I think you should be able to write example1 as:
IEnumerable<StringBuilder> example1 = dynamicObjects.Select<dynamic, StringBuilder>(s => makeStringBuilder(s));
You need to tell the compiler what type the projection should take, though I'm sure someone else can clarify why it can't infer the correct type. But I believe by specifying the projection type, you can avoid having to actually cast, which should yield some performance benefit.
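The behaviour discussed in both answers can be seen in a small self-contained sketch. The class and method names below are illustrative only. The last method shows one further option, stated here as an assumption about what suits your data: if the dynamic value is known to be a string, casting the argument before the call removes the dynamic binding from the call itself, so the compiler infers StringBuilder without a cast on the result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class DynamicSelectDemo
{
    public static StringBuilder MakeStringBuilder(dynamic s) => new StringBuilder(s);

    // The call has a dynamic argument, so it is bound at run time and the
    // lambda's inferred return type is dynamic.
    public static IEnumerable<dynamic> Example1(List<dynamic> items)
        => items.Select(s => MakeStringBuilder(s));

    // Casting the result gives the lambda a static return type.
    public static IEnumerable<StringBuilder> Example2(List<dynamic> items)
        => items.Select(s => (StringBuilder)MakeStringBuilder(s));

    // Casting the argument instead: the call is now statically bound
    // (assumes every element really is a string).
    public static IEnumerable<StringBuilder> CastArgument(List<dynamic> items)
        => items.Select(s => MakeStringBuilder((string)s));

    public static void Main()
    {
        var items = new List<dynamic> { "a", "b" };
        // At run time the elements are StringBuilder instances in every case.
        Console.WriteLine(Example1(items).First().GetType().Name);  // StringBuilder
        Console.WriteLine(Example2(items).Count());                 // 2
        Console.WriteLine(CastArgument(items).First().ToString());  // a
    }
}
```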

How to match a enum value with some enum values using linq

I want to know what the shortest LINQ query would be to replace the following if statement.
public enum ErrorMessage { Error1=1, Error2=2, Error3=3, Error4=4 }
ErrorMessage error = ErrorMessage.Error4;
if (error == ErrorMessage.Error1 || error == ErrorMessage.Error2)
{
    //do something
}
LINQ would only make this code more complicated.
The code you provided is more readable, faster and more maintainable than a LINQ version would be.
You could use
if (new [] {ErrorMessage.Error1, ErrorMessage.Error2}.Contains(error))
{
    //do something
}
or
var bad_errors = new [] {ErrorMessage.Error1, ErrorMessage.Error2};
if (bad_errors.Contains(error))
{
    //do something
}
if a single call to an extension method is LINQ enough for you.
I guess to most C# developers such a pattern seems strange (and it totally is), but if you're already working on a dynamically created list of errors you want to check against...
Otherwise, stick with if.
It actually works nicer in languages with less boilerplate, e.g. Python, where this pattern is commonly used and looks a lot nicer:
if error in (Error1, Error2):
    # do something

How can I intercept the result of an IQueryProvider query (other than single result)

I'm using Entity Framework and I have a custom IQueryProvider. I use the Execute method so that I can modify the result (a POCO) of a query after it has been executed. I want to do the same for collections. The problem is that the Execute method is only called for single results.
As described on MSDN :
The Execute method executes queries that return a single value
(instead of an enumerable sequence of values). Expression trees that
represent queries that return enumerable results are executed when
their associated IQueryable object is enumerated.
Is there another way to accomplish what I want that I missed?
I know I could write a specific method inside a repository or whatever but I want to apply this to all possible queries.
It is true that the actual signatures are:
public object Execute(Expression expression)
public TResult Execute<TResult>(Expression expression)
However, that does not mean that the TResult will always be a single element! It is the type expected to be returned from the expression.
Also, note that there are no constraints over the TResult, not even 'class' or 'new()'.
TResult is MyObject when your expression has a singular result, like .FirstOrDefault(). However, TResult can also be a double when you call .Average() on the query, and it can be IEnumerable<MyObject> when your query is a plain .Select.Where.
Proof(*) - I've just set a breakpoint inside my Execute() implementation, and I've inspected it with Watches:
typeof(TResult).FullName "System.Collections.Generic.IEnumerable`1[[xxxxxx,xxxxx]]"
expression.Type.FullName "System.Linq.IQueryable`1[[xxxxxx,xxxxx]]"
I admit that three overloads (one for object, one for TResult and one for IEnumerable<TResult>) would probably be more readable. I think they did not provide all three in order to leave an extensibility point for future interfaces. I can imagine that in the future they might come up with something more robust than IEnumerable, and then they'd need to add another overload, and so on. With this simple interface, any type can be processed.
Oh, see, we now also have IQueryable in addition to IEnumerable, so it would need at least four overloads:)
The proof is marked with (*) because I had a small bug/feature in my IQueryProvider's code that was obscuring the real behavior of LINQ.
LINQ indeed calls the generic Execute only for singular cases. This is a shortcut, an optimization.
For all other cases, it doesn't call Execute() at all.
In all those other cases, LINQ calls .GetEnumerator on your custom IQueryable<> implementation, and what happens then is dictated simply by what you wrote there. I mean, assuming that you actually provided a custom implementation of IQueryable. It would be strange if you did not: that's just about 15 lines in total, nothing compared to the length of a custom provider.
In the project where I got the "proof" from, my implementation looks like:
public System.Collections.IEnumerator GetEnumerator()
{
    return Provider.Execute<IEnumerable>( this.Expression ).GetEnumerator();
}
public IEnumerator<TOut> GetEnumerator()
{
    return Provider.Execute<IEnumerable<TOut>>( this.Expression ).GetEnumerator();
}
Of course, one of them would be an explicit interface implementation due to the name collision. Please note that to fetch the enumerator, I actually call Execute with an explicitly stated TResult. This is why those types showed up in my "proof".
I think that you see the "TResult = single element" case because you wrote something like this:
public IEnumerator<TOut> GetEnumerator()
{
    return Provider.Execute<TOut>( this.Expression ).GetEnumerator();
}
This leaves your Execute implementation no choice: it must return a single element. IMHO, this is just a bug in your code. You could have done it as in my example above, or you could simply use the untyped Execute:
public System.Collections.IEnumerator GetEnumerator()
{
    return ((IEnumerable)Provider.Execute( this.Expression )).GetEnumerator();
}
public IEnumerator<TOut> GetEnumerator()
{
    return ((IEnumerable<TOut>)Provider.Execute( this.Expression )).GetEnumerator();
}
Of course, your implementation of Execute must make sure to return proper IEnumerables for such queries!
Expression trees that represent queries that return enumerable results are executed when their associated IQueryable object is enumerated.
I recommend enumerating your query:
foreach(T t in query)
{
    CustomModification(t);
}
Your IQueryProvider must implement CreateQuery<T>. You get to choose the implementation of the resulting IQueryable. If you want that IQueryable to do something to each row when enumerated, you get to write that implementation.
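One possible shape for such an implementation is sketched below. All type names (InterceptingProvider, InterceptingQuery, InterceptDemo) are hypothetical, and the demo wraps the in-memory provider returned by AsQueryable() rather than Entity Framework. It assumes the inner provider can execute the expression trees unchanged, and it does not attempt to cover provider-specific features.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical provider: delegates to an inner provider, runs a callback per result.
public class InterceptingProvider : IQueryProvider
{
    private readonly IQueryProvider inner;
    private readonly Action<object> onResult;

    public InterceptingProvider(IQueryProvider inner, Action<object> onResult)
    {
        this.inner = inner;
        this.onResult = onResult;
    }

    // Every operator applied to our queryable comes back through here,
    // so the wrapper survives Where/Select chains.
    public IQueryable<T> CreateQuery<T>(Expression expression)
        => new InterceptingQuery<T>(this, expression);

    public IQueryable CreateQuery(Expression expression)
        => throw new NotSupportedException("Only the generic overload is sketched.");

    // Single-value path: First(), Single(), Count(), ...
    public TResult Execute<TResult>(Expression expression)
    {
        TResult result = inner.Execute<TResult>(expression);
        if (result != null) onResult(result);
        return result;
    }

    public object Execute(Expression expression)
    {
        object result = inner.Execute(expression);
        if (result != null) onResult(result);
        return result;
    }

    // Sequence path: reached from GetEnumerator, not from Execute.
    internal IEnumerator<T> Enumerate<T>(Expression expression)
    {
        foreach (T item in inner.CreateQuery<T>(expression))
        {
            onResult(item);
            yield return item;
        }
    }
}

public class InterceptingQuery<T> : IQueryable<T>
{
    private readonly InterceptingProvider provider;

    public InterceptingQuery(InterceptingProvider provider, Expression expression)
    {
        this.provider = provider;
        Expression = expression;
    }

    public Type ElementType => typeof(T);
    public Expression Expression { get; }
    public IQueryProvider Provider => provider;

    public IEnumerator<T> GetEnumerator() => provider.Enumerate<T>(Expression);
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class InterceptDemo
{
    public static List<string> Run()
    {
        var seen = new List<string>();
        IQueryable<string> source = new[] { "a", "bb", "ccc" }.AsQueryable();
        var provider = new InterceptingProvider(source.Provider, o => seen.Add(o.ToString()));
        IQueryable<string> wrapped = provider.CreateQuery<string>(source.Expression);

        foreach (var s in wrapped.Where(x => x.Length > 1)) { } // callback sees "bb", "ccc"
        wrapped.First();                                        // callback sees "a"
        return seen;
    }
}
```

Note that with this sketch the callback also fires for scalar results such as Count(), so a real implementation would probably filter on the result type.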
The final answer is that it's not possible.

LINQ Performance

What exactly is happening behind the scenes in a LINQ query against an object collection? Is it just syntactical sugar or is there something else happening making it more of an efficient query?
Do you mean in terms of a query expression, or what the query does behind the scenes?
Query expressions are expanded into "normal" C# first. For example:
var query = from x in source
            where x.Name == "Fred"
            select x.Age;
is translated to:
var query = source.Where(x => x.Name == "Fred")
                  .Select(x => x.Age);
The exact meaning of this depends on the type of source of course... in LINQ to Objects, it typically implements IEnumerable<T> and the Enumerable extension methods come into play... but it could be a different set of extension methods. (LINQ to SQL would use the Queryable extension methods, for example.)
Now, suppose we are using LINQ to Objects... after extension method expansion, the above code becomes:
var query = Enumerable.Select(Enumerable.Where(source, x => x.Name == "Fred"),
                              x => x.Age);
Next the implementations of Select and Where become important. Leaving out error checking, they're something like this:
public static IEnumerable<T> Where<T>(this IEnumerable<T> source,
                                      Func<T, bool> predicate)
{
    foreach (T element in source)
    {
        if (predicate(element))
        {
            yield return element;
        }
    }
}
public static IEnumerable<TResult> Select<TSource, TResult>
    (this IEnumerable<TSource> source,
     Func<TSource, TResult> selector)
{
    foreach (TSource element in source)
    {
        yield return selector(element);
    }
}
Next there's the expansion of iterator blocks into state machines, which I won't go into here but which I have an article about.
Finally, there's the conversion of lambda expressions into extra methods + appropriate delegate instance creation (or expression trees, depending on the signatures of the methods called).
So basically LINQ uses a lot of clever features of C#:
Lambda expression conversions (into delegate instances and expression trees)
Extension methods
Type inference for generic methods
Iterator blocks
Often anonymous types (for use in projections)
Often implicit typing for local variables
Query expression translation
However, the individual operations are quite simple - they don't perform indexing etc. Joins and groupings are done using hash tables, but straightforward queries like "where" are just linear. Don't forget that LINQ to Objects usually just treats the data as a forward-only readable sequence - it can't do things like a binary search.
Normally I'd expect hand-written queries to be marginally faster than LINQ to Objects as there are fewer layers of abstraction, but they'll be less readable and the performance difference usually won't be significant.
As ever for performance questions: when in doubt, measure!
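A rough way to measure, sketched under the assumption of a simple Person class and an arbitrary data size. The timings depend entirely on the machine and runtime, so none are shown; the only thing the sketch guarantees is that both versions produce the same result.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class LinqVsLoop
{
    // The LINQ to Objects version of the query used above.
    public static List<int> WithLinq(IEnumerable<Person> source)
        => source.Where(x => x.Name == "Fred").Select(x => x.Age).ToList();

    // The hand-written equivalent: one linear pass, no iterator layers.
    public static List<int> ByHand(IEnumerable<Person> source)
    {
        var ages = new List<int>();
        foreach (Person p in source)
        {
            if (p.Name == "Fred")
            {
                ages.Add(p.Age);
            }
        }
        return ages;
    }

    public static void Main()
    {
        // Arbitrary test data: every third person is called Fred.
        var people = Enumerable.Range(0, 1000000)
            .Select(i => new Person { Name = i % 3 == 0 ? "Fred" : "Other", Age = i % 90 })
            .ToList();

        var sw = Stopwatch.StartNew();
        var linqResult = WithLinq(people);
        sw.Stop();
        Console.WriteLine("LINQ:    " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        var loopResult = ByHand(people);
        sw.Stop();
        Console.WriteLine("By hand: " + sw.ElapsedMilliseconds + " ms");

        Console.WriteLine(linqResult.SequenceEqual(loopResult)); // True
    }
}
```

For anything beyond a quick sanity check, repeated runs with a warm-up pass give more trustworthy numbers than a single Stopwatch reading.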
If you need better performance, consider trying i4o - Index for Objects. It builds in-memory indexes for large collections (think 100,000+ rows), which LINQ then uses to speed up queries. You need a lot of data to make this worthwhile, but the improvements are impressive.
http://www.codeplex.com/i4o
It's just syntactic sugar - there's no magic involved.
You could write out the equivalent code in "longhand", in C# or whatever, and it would perform the same.
(The compiler will do a good job of producing efficient code, of course, so the code it produces might be a fraction more efficient than the code you would write yourself, simply because you might not know the most performant way to write that code.)
