Sorting IQueryable by Aggregate in VB.net - linq

I have been searching for a quick example of sorting an IQueryable (using LINQ to SQL) by an aggregate value.
I basically need to calculate a few derived values (the percentage difference between two values, etc.) and sort the results by them.
For example:
return rows.OrderBy(Function(s) CalcValue(s.Visitors, s.Clicks))
I want to call an external function to calculate the aggregate. Should this implement IComparer or IComparable?
Thanks.
[EDIT]
Have tried to use:
Public Class SortByCPC : Implements IComparer(Of Statistic)
    Public Function Compare(ByVal x As Statistic, ByVal y As Statistic) As Integer Implements System.Collections.Generic.IComparer(Of Statistic).Compare
        Dim xCPC = x.Earnings / x.Clicks
        Dim yCPC = y.Earnings / y.Clicks
        ' CompareTo avoids rounding a fractional difference down to 0 (sorts descending)
        Return yCPC.CompareTo(xCPC)
    End Function
End Class
LINQ to SQL doesn't like me using IComparer.

LINQ to SQL is never going to like you using your own methods within a query - it can't see inside them and work out what you want the SQL to look like. It can only see inside expression trees, built up from lambda expressions in the query.
What you want is something like:
Dim stats = From x In db.Statistics _
            Where (something, if you want filtering) _
            Order By x.Earnings / x.Clicks
If you really want to fetch all of the results and then order them, you need to indicate to LINQ that you're "done" with the IQueryable side of things - call AsEnumerable() and then you can do any remaining processing on the client. It's better to get the server to do as much as possible though.
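As a rough illustration of the AsEnumerable() hand-off described above, here is a minimal sketch in C# rather than VB. The Statistic class, the CostPerClick helper and the ClientSideOrdering class are hypothetical names invented for this example, and an in-memory IQueryable stands in for a real LINQ to SQL table. With an in-memory source the helper would run even without AsEnumerable(); with a real provider it is the AsEnumerable() call that makes the custom method usable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Statistic entity in the question.
public class Statistic
{
    public string Name { get; set; }
    public double Earnings { get; set; }
    public int Clicks { get; set; }
}

public static class ClientSideOrdering
{
    // Hypothetical helper: a SQL-backed provider could not translate a call to it.
    public static double CostPerClick(Statistic s)
    {
        return s.Clicks == 0 ? 0.0 : s.Earnings / s.Clicks;
    }

    public static List<string> NamesByCostPerClick(IQueryable<Statistic> stats)
    {
        return stats
            .Where(s => s.Clicks > 0)       // still IQueryable: a provider would translate this
            .AsEnumerable()                 // from here on, LINQ to Objects runs on the client
            .OrderBy(s => CostPerClick(s))  // arbitrary .NET method calls are fine now
            .Select(s => s.Name)
            .ToList();
    }
}
```

Keeping the Where before AsEnumerable() means the filtering can still happen on the server, so only the rows that need client-side sorting are fetched.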

My VB is pretty bad, but I think this is what it should look like. This assumes that CalcValue returns a Double and that the type of the rows is RowClass. This example does not use the IComparer overload of the OrderBy extension method; it relies on the fact that doubles are already comparable and returns the result of CalcValue as the sort key.
Dim keySelector As Func(Of RowClass, Double) = _
    Function(s As RowClass) CalcValue(s.Visitors, s.Clicks)
Return rows.OrderBy(keySelector)
Here are some links you might find useful.
IQueryable.OrderBy extension method
Lambda expressions for Visual Basic

My solution:
Dim stats = rows.OrderBy(Function(s) If(s.Visitors > 0, s.Clicks / s.Visitors, 0))
This also guards against division by zero.

Related

Late binding with COM objects (flexgrid) is 2 times slower than early binding

Public Function gridloop(MSFG1 As Object) As Long
    For i = 0 To MSFG1.rows - 1
        A = MSFG1.TextMatrix(i, 1)
    Next
End Function
The code above is twice as slow as the same function declared with either of the signatures below:
Public Function gridloop(MSFG1 As MSHFlexGrid) As Long
Public Function gridloop(MSFG1 As MSFlexGrid) As Long
Is there any way to speed it up?
The question doesn't give many details. I presume you have two (or more) different controls and are essentially trying to overload your gridloop function so that it works with multiple types of control?
The following might improve performance. I have not tested it, or even confirmed that it compiles. The idea is to determine the control type and assign the control to a variable of the matching type, so that the references to its methods and properties can be early bound (and thus faster).
Public Function gridloop(MSFG1 As Object) As Long
    Dim myMSHFlexGrid As MSHFlexGrid
    Dim myMSFlexGrid As MSFlexGrid
    Dim i As Integer
    Dim A As Long
    If TypeOf MSFG1 Is MSHFlexGrid Then
        Set myMSHFlexGrid = MSFG1
        For i = 0 To myMSHFlexGrid.rows - 1
            A = myMSHFlexGrid.TextMatrix(i, 1)
        Next
    ElseIf TypeOf MSFG1 Is MSFlexGrid Then
        Set myMSFlexGrid = MSFG1
        For i = 0 To myMSFlexGrid.rows - 1
            A = myMSFlexGrid.TextMatrix(i, 1)
        Next
    End If
End Function
An alternative is to define two gridloop functions, one for each type: a form of manual overloading.
Public Function gridloop_MSHFlexGrid(MSFG1 As MSHFlexGrid) As Long
Public Function gridloop_MSFlexGrid(MSFG1 As MSFlexGrid) As Long
The advantage of this is that calling one of the gridloop functions with the 'incorrect' control results in a compile error, catching early a problem that could otherwise take significant time to track down at run time.
Building on MarkL's answer, you could use an actual VB interface to get what you want.
The idea would be to create an interface exposing whatever properties or functions you need on the grids, and then create two classes, each one implementing that interface and internally manipulating the actual grid.
This wrapping compensates for the fact that the two grid types do not intrinsically share a common interface. (I looked in the IDL using OLEView.)
You can then use the interface as the type in every location you currently are using Object to stand in for the actual grid class. If the interface is comprehensive AND its methods / properties are named appropriately then you would not need to make any other code changes.
Sample (pseudo)code...
Interface:
' In class module "IGridWrapper"
Public Function Rows() As Long
End Function
Public Function TextMatrix(ByVal i As Long, ByVal j As Long) As String
End Function
' Any others...
Wrapper class 1:
' In class module "MSFlexGridWrapper"
Implements IGridWrapper
Private m_grid As MSFlexGrid
Public Sub Init(grid As MSFlexGrid)
    Set m_grid = grid
End Sub
Private Function IGridWrapper_Rows() As Long
    ' Forward to the wrapped grid's row count
    IGridWrapper_Rows = m_grid.Rows
End Function
Private Function IGridWrapper_TextMatrix(ByVal i As Long, ByVal j As Long) As String
    IGridWrapper_TextMatrix = m_grid.TextMatrix(i, j)
End Function
' Any others...
Wrapper class 2:
' In class module "MSHFlexGridWrapper"
Implements IGridWrapper
Private m_grid As MSHFlexGrid
Public Sub Init(grid As MSHFlexGrid)
    Set m_grid = grid
End Sub
Private Function IGridWrapper_Rows() As Long
    ' Forward to the wrapped grid's row count
    IGridWrapper_Rows = m_grid.Rows
End Function
Private Function IGridWrapper_TextMatrix(ByVal i As Long, ByVal j As Long) As String
    IGridWrapper_TextMatrix = m_grid.TextMatrix(i, j)
End Function
' Any others...
Code using the wrappers:
Public Function gridloop(MSFG1 As IGridWrapper) As Long
(Note - none of this has been put through a compiler for exact syntax checking)
The basic reason that late binding (binding at run time) is slower than early binding (binding at compile time) is that you have to use the IDispatch interface's GetIDsOfNames and Invoke methods to access properties in the object's interface, rather than accessing them directly from the vtable. For more information, have a look at this.
The reason that DaveInCaz's and MarkL's suggestions will probably speed things up is that they are ways to allow your gridloop function to accept a type that can be bound early rather than an Object type. DaveInCaz's solution is also a fine example of a practical application of polymorphism.

c# generic orderby

In my base repository class
I wrote this function to make it possible to retrieve a sorted data collection from the DB.
T is a generic type parameter defined at class level:
public abstract class RepositoryBase<T>
where T : class
The code is this:
public IList<T> GetAll<TKey>(Expression<Func<T, bool>> whereCondition, Expression<Func<T, TKey>> sortCondition, bool sortDesc = false)
{
    if (sortDesc)
        return this.ObjectSet.Where(whereCondition).OrderByDescending(sortCondition).ToList<T>();
    return this.ObjectSet.Where(whereCondition).OrderBy(sortCondition).ToList<T>();
}
My goal was to introduce a generic sort parameter so that I could call the function like this:
repo.GetAll (model=>model.field>0, model=>model.sortableField, true)
That is, I could specify the sort field directly via a lambda expression and so get IntelliSense support.
Unfortunately this function doesn't work, as the last line of code generates errors at compile time.
I also tried calling:
repo.GetAll<Model> (model=>model.field>0, model=>model.sortableField, true)
but this doesn't work either.
How should I write the function to meet my goal?
I'm working with EF 5, C#, .NET 4.5.
You're using ObjectSet, which implements IQueryable<T>. That is extended by methods on System.Linq.Queryable, which accept Expression<Func<...>> parameters. It is correct to use those Expression parameters, as you intend for execution to occur in the database, not locally.
A Func is a delegate: a reference to a compiled .NET method.
An Expression is a tree, which may be compiled into a Func, or may be translated into SQL or something else.
You showed us a really abstract use of the method, but not an actual use of the method, or the compiler error. I suspect the error you may be making is confusing the two type parameters.
You said:
repo.GetAll<Model> (model=>model.field>0, model=>model.sortableField, true)
But the generic parameter of this method represents the type of sortableField. If sortableField isn't a Model, this is wrong.
Instead, you should be doing something like this:
Repository<Person> myRepo = new Repository<Person>();
myRepo.GetAll<DateTime>(p => p.Friends.Count() > 3, p => p.DateOfBirth, true);
If specifying the sort type breaks your intended pattern of usage, consider hiding that key by using an IOrderer: Store multi-type OrderBy expression as a property
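To make the role of the two type parameters concrete, here is a minimal sketch of the same GetAll shape. InMemoryRepository and Person are hypothetical names for this example only, and an in-memory IQueryable stands in for the EF ObjectSet, so this illustrates the signature rather than the asker's actual repository.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical in-memory repository; a real one would query an EF ObjectSet<T>.
public class InMemoryRepository<T> where T : class
{
    private readonly IQueryable<T> _items;

    public InMemoryRepository(IEnumerable<T> items)
    {
        _items = items.AsQueryable();
    }

    // T is fixed by the class; TKey is the type of the sort key only.
    public IList<T> GetAll<TKey>(
        Expression<Func<T, bool>> whereCondition,
        Expression<Func<T, TKey>> sortCondition,
        bool sortDesc = false)
    {
        IQueryable<T> filtered = _items.Where(whereCondition);
        IQueryable<T> sorted = sortDesc
            ? filtered.OrderByDescending(sortCondition)
            : filtered.OrderBy(sortCondition);
        return sorted.ToList();
    }
}

// Hypothetical entity used for the usage example.
public class Person
{
    public string Name { get; set; }
    public DateTime DateOfBirth { get; set; }
}
```

With this shape, a call such as repo.GetAll(p => p.Name != "", p => p.DateOfBirth, true) should compile without an explicit type argument, because the compiler can infer TKey as DateTime from the second lambda.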

What does Expression<T> do?

What does Expression<T> do?
I have seen it used in a method similar to:
private Expression<Func<MyClass,bool>> GetFilter(...)
{
}
Can't you just return the Func<MyClass,bool> ?
Google and SO searches have failed me due to the < and > signs.
If TDelegate represents a delegate type, then Expression<TDelegate> represents a lambda expression that can be converted to a delegate of type TDelegate as an expression tree. This allows you to programmatically inspect a lambda expression to extract useful information.
For example, if you have
var query = source.Where(x => x.Name == "Alan Turing");
then x => x.Name == "Alan Turing" can be inspected programmatically if it's represented as an expression tree, but not so much if it's thought of as a delegate. This is particularly useful in the case of LINQ providers, which will walk the expression tree to convert the lambda expression into a different representation. For example, LINQ to SQL would convert the above expression tree to
SELECT * FROM COMPUTERSCIENTIST WHERE NAME = 'Alan Turing'
It can do that because of the representation of the lambda expression as a tree whose nodes can be walked and inspected.
An Expression allows you to inspect the structure of the code inside of the delegate rather than just storing the delegate itself.
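A small sketch of what "walking the tree" can look like for the Alan Turing filter. Scientist and ExpressionInspection are hypothetical names, and Describe only handles the simple shape "member == literal constant"; a real LINQ provider uses a far more general visitor.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity for this illustration.
public class Scientist
{
    public string Name { get; set; }
}

public static class ExpressionInspection
{
    // Describes a filter of the form x => x.Member == <literal>.
    // Assumption: the right-hand side is a literal, not a captured variable.
    public static string Describe(Expression<Func<Scientist, bool>> filter)
    {
        var comparison = (BinaryExpression)filter.Body;        // x.Name == "..."
        var member = (MemberExpression)comparison.Left;        // x.Name
        var constant = (ConstantExpression)comparison.Right;   // "..."
        return member.Member.Name + " " + comparison.NodeType + " " + constant.Value;
    }
}
```

Calling ExpressionInspection.Describe(x => x.Name == "Alan Turing") returns the string "Name Equal Alan Turing", which is roughly the information a provider needs to emit a WHERE clause. The same lambda typed as a plain Func offers no such access.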
As usual, MSDN is pretty clear on the matter:
MSDN - Expression(TDelegate)
Yes, a Func<> can be used in place of an Expression. The utility of an expression tree is that it gives remote LINQ providers such as LINQ to SQL the ability to look ahead and see what statements are required to allow the query to function. In other words, to treat code as data.
// Run the debugger and hover over multBy2. It can tell you that it is a method, but not what the implementation is.
Func<int, int> multBy2 = x => 2 * x;
// Hover over this and it will tell you the implementation: the parameters, the method body and other data.
System.Linq.Expressions.Expression<Func<int, int>> expression = x => 2 * x;
In the code above you can compare what data is available via the debugger. I invite you to do this. You will see that Func has very little information available. Try it again with Expressions and you will see a lot of information including the method body and parameters are visible at runtime. This is the real power of Expression Trees.
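The same comparison can be made in code instead of in the debugger. This is a minimal sketch with a hypothetical class name (CodeAsData): it reads the parameter name and the root node type from the tree, then compiles the tree into an ordinary delegate.

```csharp
using System;
using System.Linq.Expressions;

public static class CodeAsData
{
    private static readonly Expression<Func<int, int>> Doubler = x => 2 * x;

    // The tree exposes its structure: the parameter list and the body node.
    public static string ParameterName()
    {
        return Doubler.Parameters[0].Name;
    }

    public static ExpressionType BodyNodeType()
    {
        return Doubler.Body.NodeType;
    }

    // An expression tree can still be turned into a runnable delegate.
    public static int RunCompiled(int input)
    {
        Func<int, int> compiled = Doubler.Compile();
        return compiled(input);
    }
}
```

Going from Expression to Func is a single Compile() call, but there is no general way to go from a compiled Func back to a tree. That asymmetry is one reason a method like GetFilter would return the Expression form.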

Are LINQ-returned IEnumerables "cast-hack" free?

Are IEnumerables returned by LINQ methods such as Select or SelectMany "cast-hack" free? For instance, you can return from a function whose output type is IEnumerable an IList, but if you cast it back to IList you will be able to modify it. Does the same happen with IEnumerables returned by LINQ?
Yes. The LINQ methods return special iterator collection classes that wrap the original data source or employ the yield keyword. The reason is deferred execution.
For example:
Select and Where return an instance of the private class WhereSelectEnumerableIterator<TSource, TResult>.
Except and Distinct use the yield keyword to return the elements from the collections that match the condition.
You can use ILSpy to have a look at this code yourself.
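A quick way to check this yourself without a decompiler is a type test. In this sketch CastHackDemo and its methods are hypothetical names. It uses Where for the LINQ case, since the concrete iterator classes are implementation details that differ between runtime versions.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CastHackDemo
{
    // Exposes the list itself, merely typed as IEnumerable<int>.
    public static IEnumerable<int> ExposeDirectly(List<int> list)
    {
        return list;
    }

    // Exposes a LINQ iterator that wraps the list.
    public static IEnumerable<int> ExposeThroughLinq(List<int> list)
    {
        return list.Where(x => true);
    }

    // True if a caller could cast the sequence back to a mutable list interface.
    public static bool CanCastBackToList(IEnumerable<int> sequence)
    {
        return sequence is IList<int>;
    }
}
```

CanCastBackToList returns true for the result of ExposeDirectly and false for the result of ExposeThroughLinq, so a caller cannot reach the underlying list through the LINQ wrapper by casting.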

LINQ Performance

What exactly is happening behind the scenes in a LINQ query against an object collection? Is it just syntactical sugar or is there something else happening making it more of an efficient query?
Do you mean in terms of a query expression, or what the query does behind the scenes?
Query expressions are expanded into "normal" C# first. For example:
var query = from x in source
            where x.Name == "Fred"
            select x.Age;
is translated to:
var query = source.Where(x => x.Name == "Fred")
                  .Select(x => x.Age);
The exact meaning of this depends on the type of source of course... in LINQ to Objects, it typically implements IEnumerable<T> and the Enumerable extension methods come into play... but it could be a different set of extension methods. (LINQ to SQL would use the Queryable extension methods, for example.)
Now, suppose we are using LINQ to Objects... after extension method expansion, the above code becomes:
var query = Enumerable.Select(Enumerable.Where(source, x => x.Name == "Fred"),
                              x => x.Age);
Next the implementations of Select and Where become important. Leaving out error checking, they're something like this:
public static IEnumerable<T> Where<T>(this IEnumerable<T> source,
                                      Func<T, bool> predicate)
{
    foreach (T element in source)
    {
        if (predicate(element))
        {
            yield return element;
        }
    }
}
public static IEnumerable<TResult> Select<TSource, TResult>
    (this IEnumerable<TSource> source,
     Func<TSource, TResult> selector)
{
    foreach (TSource element in source)
    {
        yield return selector(element);
    }
}
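One consequence of writing these operators as iterator blocks is deferred execution: nothing runs until the result is enumerated. The sketch below is an illustration under assumptions, not the framework's code. It uses a copy of the simplified Where under the hypothetical name MyWhere (to avoid clashing with System.Linq) and counts how often the predicate is invoked.

```csharp
using System;
using System.Collections.Generic;

public static class DeferredExecutionSketch
{
    // Same shape as the simplified Where above, renamed to avoid ambiguity.
    public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T element in source)
        {
            if (predicate(element))
            {
                yield return element;
            }
        }
    }

    // Returns { predicate calls before enumeration, predicate calls after enumeration }.
    public static int[] CountPredicateCalls()
    {
        int calls = 0;
        var numbers = new List<int> { 1, 2, 3, 4 };

        // Building the query runs none of the iterator body.
        IEnumerable<int> evens = numbers.MyWhere(n => { calls++; return n % 2 == 0; });
        int before = calls;

        // Enumerating drives the state machine and calls the predicate once per element.
        foreach (int n in evens) { }
        int after = calls;

        return new[] { before, after };
    }
}
```

CountPredicateCalls returns { 0, 4 }: the predicate is not called when the query is built, and is called once per source element when the query is enumerated.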
Next there's the expansion of iterator blocks into state machines, which I won't go into here but which I have an article about.
Finally, there's the conversion of lambda expressions into extra methods + appropriate delegate instance creation (or expression trees, depending on the signatures of the methods called).
So basically LINQ uses a lot of clever features of C#:
Lambda expression conversions (into delegate instances and expression trees)
Extension methods
Type inference for generic methods
Iterator blocks
Often anonymous types (for use in projections)
Often implicit typing for local variables
Query expression translation
However, the individual operations are quite simple - they don't perform indexing etc. Joins and groupings are done using hash tables, but straightforward queries like "where" are just linear. Don't forget that LINQ to Objects usually just treats the data as a forward-only readable sequence - it can't do things like a binary search.
Normally I'd expect hand-written queries to be marginally faster than LINQ to Objects as there are fewer layers of abstraction, but they'll be less readable and the performance difference usually won't be significant.
As ever for performance questions: when in doubt, measure!
If you need better performance, consider trying i4o (Index for Objects). It builds in-memory indexes for large collections (think 100,000+ rows), which LINQ then uses to speed up queries. You need a lot of data to make this worthwhile, but the improvements are impressive.
http://www.codeplex.com/i4o
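To illustrate the general idea of an in-memory index, here is a sketch using only standard LINQ to Objects calls. It does not use or reproduce i4o's API; Customer and IndexSketch are hypothetical names for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type used only for this illustration.
public class Customer
{
    public string City { get; set; }
    public string Name { get; set; }
}

public static class IndexSketch
{
    // Linear scan: every call walks the whole list.
    public static List<string> NamesInCityLinear(List<Customer> customers, string city)
    {
        return customers.Where(c => c.City == city).Select(c => c.Name).ToList();
    }

    // Build a hash-based lookup once; each later query is a single hash probe.
    public static ILookup<string, string> BuildCityIndex(List<Customer> customers)
    {
        return customers.ToLookup(c => c.City, c => c.Name);
    }
}
```

After var index = IndexSketch.BuildCityIndex(customers), the expression index["Oslo"] yields the same names as the linear scan, and a city that is not present yields an empty sequence. Building the lookup costs one full pass, so it only pays off when the same collection is queried repeatedly.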
It's just syntactic sugar - there's no magic involved.
You could write out the equivalent code in "longhand", in C# or whatever, and it would perform the same.
(The compiler will do a good job of producing efficient code, of course, so the code it produces might be a fraction more efficient than the code you would write yourself, simply because you might not know the most performant way to write that code.)

Resources