Deadloop after using orderby - linq

I want to get a list of files in a folder in which the files are named as 0.html, 1.html, 2.html,... 10.html, 11.html.....
I want to sort them by number, not by preceding number.
so I write the query
var SeedPages = from pages in Directory.GetFiles(DownloadFolderString) orderby pages.Length select pages;
when I access SeedPages.First(), it keeps looping at pages.Length
I don't understand why the program goes back to the query statement.

The execution of this LINQ statement is deferred, you can put all resulting values in a list or array by calling .ToList() or ToArray() on your LINQ statement.

Related

How can I use Math.X functions with LINQ?

I have a simple table (SQL server and EF6) Myvalues, with columns Id & Value (double)
I'm trying to get the sum of the natural log of all values in this table. My LINQ statement is:
var sum = db.Myvalues.Select(x => Math.Log(x.Value)).Sum();
It compiles fine, but I'm getting a RTE:
LINQ to Entities does not recognize the method 'Double Log(Double)' method, and this method cannot be translated into a store expression.
What am I doing wrong/how can I fix this?
FWIW, I can execute the following SQL query directly against the database which gives me the correct answer:
select exp(sum(LogCol)) from
(select log(Myvalues.Value) as LogCol From Myvalues
) results
LINQ tries to translate Math.Log into a SQL command so it is executed against the DB.
This is not supported.
The first solution (for SQL Server) is to use one of the existing SqlFunctions. More specifically, SqlFunctions.Log.
The other solution is to retrieve all your items from your DB using .ToList(), and execute Math.Log with LINQ to Objects (not LINQ to Entities).
As EF cannot translate Math.Log() you could get your data in memory and execute the function form your client:
var sum = db.Myvalues.ToList().Select(x => Math.Log(x.Value)).Sum();

How do I filter in linq query when field needs parsing first?

I have a data table containing multiple columns and one column that stores somewhat complex text patterns - I need to parse the field to determine if a particular sub strings exist in specific positions within the larger string pattern and then if the record should be filtered out as a result.
I can't see a way to perform the parse other than by writing a C# parsing function with String.Split method calls, foreach, etc. But if I try to parse like this:
var myFilteredTable = _db.MyTable.Where(t => t.Column1 == 'Filter1'
&& ParseIsMyItemInColumn2(t) );
I get "has no supported translation to SQL" errors.
The other option I thought of was to build the initial result without the Parse:
var myFilteredTable = _db.MyTable.Where(t => t.Column1 == 'Filter1' );
and iterate through the IQueryable resultset, testing each row with the parse function, to filter out the unwanted rows, but IQueryable does not have Remove function to strip out unwanted rows nor Add function to allow me to build up a new resultset.
So how can I filter in linq when I also need to write a Parse function?
Well the "initial filter in the database then do the rest locally" is easy:
var filtered = _db.MyTable.Where(t => t.Column1 == "Filter1")
.AsEnumerable() // Do the rest locally
.Where(t => ParseIsMyItemInColumn2(t));
AsEnumerable is a simple pass through method, but because the result is typed as IEnumerable<T> rather than IQueryable<T>, the subsequent LINQ operations use the LINQ to Objects methods in Enumerable rather than the ones in Queryable.
Obviously if a lot of items match the first filter but fail the second, that won't be terribly efficient...
Unfortunately, if the "parse function" is not something that can be translated to SQL, you will need to pull the results and use LINQ to Objects:
var myFilteredTable = _db.MyTable.Where(t => t.Column1 == 'Filter1')
.AsEnumerable().Where(ParseIsMyItemInColumn2);
Note that this will stream all of the results into memory, and then perform your parse.

NHibernate IQueryable doesn't seem to delay execution

I'm using NHibernate 3.2 and I have a repository method that looks like:
public IEnumerable<MyModel> GetActiveMyModel()
{
return from m in Session.Query<MyModel>()
where m.Active == true
select m;
}
Which works as expected. However, sometimes when I use this method I want to filter it further:
var models = MyRepository.GetActiveMyModel();
var filtered = from m in models
where m.ID < 100
select new { m.Name };
Which produces the same SQL as the first one and the second filter and select must be done after the fact. I thought the whole point in LINQ is that it formed an expression tree that was unravelled when it's needed and therefore the correct SQL for the job could be created, saving my database requests.
If not, it means all of my repository methods have to return exactly what is needed and I can't make use of LINQ further down the chain without taking a penalty.
Have I got this wrong?
Updated
In response to the comment below: I omitted the line where I iterate over the results, which causes the initial SQL to be run (WHERE Active = 1) and the second filter (ID < 100) is obviously done in .NET.
Also, If I replace the second chunk of code with
var models = MyRepository.GetActiveMyModel();
var filtered = from m in models
where m.Items.Count > 0
select new { m.Name };
It generates the initial SQL to retrieve the active records and then runs a separate SQL statement for each record to find out how many Items it has, rather than writing something like I'd expect:
SELECT Name
FROM MyModel m
WHERE Active = 1
AND (SELECT COUNT(*) FROM Items WHERE MyModelID = m.ID) > 0
You are returning IEnumerable<MyModel> from the method, which will cause in-memory evaluation from that point on, even if the underlying sequence is IQueryable<MyModel>.
If you want to allow code after GetActiveMyModel to add to the SQL query, return IQueryable<MyModel> instead.
You're running IEnumerable's extension method "Where" instead of IQueryable's. It will still evaluate lazily and give the same output, however it evaluates the IQueryable on entry and you're filtering the collection in memory instead of against the database.
When you later add an extra condition on another table (the count), it has to lazily fetch each and every one of the Items collections from the database since it has already evaluated the IQueryable before it knew about the condition.
(Yes, I would also like to be the extensive extension methods on IEnumerable to instead be virtual members, but, alas, they're not)

no supported translation to SQL Help Linq -> SQL

I have a query that has a where clause that checks if the data element is contained within a list.
This query executes fine:
results = awardedStats.Where(r => guidReq.Contains(r.RequirementId) || orgAcr.Contains(r.Division))
.Select(r => r);
however this does not:
results = awardedStats.Where(r => guidReq.Contains(r.RequirementId) || orgAcrId.Contains(r.guidDivisionId))
.Select(r => r);
r.division is a string and orgAcr is a List
r.guidDivisionId is a Guid and orgAcrId is a List
I know that each list get the correct values, I can check the list in the debugger, but if I run the first query, everything runs through fine, if I run the second query I get an error stating that the member "guidDivisionId" has no supported translation to SQL
If orgAcrId is a List<Guid> and r.guidDivisionId is a uniqueidentifier column in SQL Server, this should be fine. Are you sure the column name isn't r.DivisionId?
Get all data from sql and call AsEnumarable() method on it and then apply the where. That way comparison would be done in memory and it won't complain about sql translation.
Another thing is that when you use contains, it's converted to Sql IN clause. All elements in the list are included in the IN clause. If your list has more than 2100 elements, then you would get sql exception saying that sql cannot accept more than 2100 parameters. Doing this kind of comparison in memory is safer.

LINQ - Deferred Execution in Subqueries

My understanding is that the use of scalar or conversion functions causes immediate execution of a LINQ query. It is also my understanding that subqueries are executed upon demand of the outer query which would typically be once per element. For the following example would I be right in saying that the inner query is executed immediately? If so, as this would produce a scalar value how would this affect how the outer query operates?
IEnumerable<string> outerQuery = names.Where ( n => n.Length == names
.OrderBy(n2 => n2.Length).Select(n2 => n2.Length).First());
I would expect the above query to operate in a similar way as below, ie as if there wasn't a subquery.
int val = names.OrderBy(n2 => n2.Length).Select(n2 => n2.Length).First();
IEnumerable<string> outerQuery = names.Where ( n => n.Length == val );
This example was taken from Joseph and Ben Albahari's C# 4.0 in a Nutshell (Chp 8 P331/332) and my confusion stems from the accompanying diagram which appears to show that the subquery is being evaluated each time the outer query iterates through the elements of names.
Could someone clarify how LINQ works in this setup? Any help would be appreciated!
For the following example would I be right in saying that the inner query is executed immediately?
No, the inner query will be executed for each item in names when the outer query is enumerated. If you want it to be executed only once, use the second code sample.
EDIT: as LukeH pointed out, this is only true of Linq to Objects. Other Linq providers (Linq to SQL, Entity Framework...) might be able to optimize this automatically
What is names? If it's collection (and you use LINQ to Objects) then "subquery" will be executed for each outer query item. If it's actually query object then result depends on actual IQueryable.Provider. For example, for LINQ to SQL you will give SQL query with scalar subquery. And in the most cases subquery actually will be executed only once.

Resources