I have this query:
var iterator = criteria.binaryAssetBranchNodeIds.GetEnumerator();
iterator.MoveNext();
var binaryAssetStructures = from bas in db.BinaryAssetStructures
where bas.BinaryAssetStructureId == iterator.Current
select bas;
When I iterate over the binaryAssetStructureIds with a foreach loop no problems occur. When I try this
var binaryAssetStructure = binaryAssetStructures.ElementAt(0);
I get the following error:
Unable to cast object of type 'System.Linq.Expressions.MethodCallExpression' to type 'SubSonic.Linq.Structure.ProjectionExpression'
First() for example does work... What am I missing here...
I don't know SubSonic at all, but FWIW a similar issue exists with the Entity Framework. In that case it boils down to the fact that there's no direct translation of ElementAt to SQL.
First() can be easily translated to SELECT TOP 1 FROM ... ORDER BY ..., but the same is not easily expressed for ElementAt.
You could argue that e.g. ElementAt(5) should be translated to SELECT TOP 5 FROM ... ORDER BY ... and then the first four elements simply discarded, but that doesn't work very well if you ask for ElementAt(100000).
In EF, you can partially overcome this issue by forcing the expression to be evaluated first, which can be done with calls to AsEnumerable, ToList or ToArray.
For example
var binaryAssetStructure = binaryAssetStructures.AsEnumerable().ElementAt(0);
I hope this helps, although it is not specifically about SubSonic.
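If materialising the results just to read one element is too costly, another option worth trying is Skip(n).First() over an explicit ordering, which keeps the work on the IQueryable side. Whether SubSonic can translate Skip is an assumption I have not verified. The sketch below uses an in-memory list as a stand-in for the table (the Asset type, the method names and the sample data are hypothetical) and only shows that both forms pick the same element:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ElementAtSketch
{
    // Hypothetical stand-in for a table row.
    public sealed class Asset
    {
        public int Id { get; set; }
        public string Name { get; set; } = "";
    }

    // Workaround from the answer: switch to LINQ to Objects, then index.
    public static Asset ViaAsEnumerable(IQueryable<Asset> query, int index)
    {
        return query.AsEnumerable().ElementAt(index);
    }

    // Alternative: stay on IQueryable and express the index as Skip + First.
    public static Asset ViaSkipFirst(IQueryable<Asset> query, int index)
    {
        return query.OrderBy(a => a.Id).Skip(index).First();
    }

    public static void Main()
    {
        // In-memory data already sorted by Id, so both calls see the same order.
        IQueryable<Asset> table = new List<Asset>
        {
            new Asset { Id = 1, Name = "first" },
            new Asset { Id = 2, Name = "second" },
            new Asset { Id = 3, Name = "third" },
        }.AsQueryable();

        Console.WriteLine(ViaAsEnumerable(table, 1).Name);
        Console.WriteLine(ViaSkipFirst(table, 1).Name);
    }
}
```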
Related
What is the explanation for EF downloading all result rows when AsEnumerable() is used?
What I mean is that this code:
context.Logs.AsEnumerable().Where(x => x.Id % 2 == 0).Take(100).ToList();
will download all the rows from the table before passing any row to the Where() method and there could be millions of rows in the table.
What I would like it to do, is to download only enough to gather 100 rows that would satisfy the Id % 2 == 0 condition (most likely just around 200 rows).
Couldn't EF do on demand loading of rows like you can with plain ADO.NET using Read() method of SqlDataReader and save time and bandwidth?
I suppose that it does not work like that for a reason and I'd like to hear a good argument supporting that design decision.
NOTE: This is a completely contrived example and I know normally you should not use EF this way, but I found this in some existing code and was just surprised my assumptions turned out to be incorrect.
The short answer: The reason for the different behaviors is that, when you use IQueryable directly, a single SQL query can be formed for your entire LINQ query; but when you use IEnumerable, the entire table of data must be loaded.
The long answer: Consider the following code.
context.Logs.Where(x => x.Id % 2 == 0)
context.Logs is of type IQueryable<Log>. IQueryable<Log>.Where is taking an Expression<Func<Log, bool>> as the predicate. The Expression represents an abstract syntax tree; that is, it's more than just code you can run. Think of it as being represented in memory, at runtime, like this:
Lambda (=>)
Parameters
Variable: x
Body
Equals (==)
Modulo (%)
PropertyAccess (.)
Variable: x
Property: Id
Constant: 2
Constant: 0
The LINQ-to-Entities engine can take context.Logs.Where(x => x.Id % 2 == 0) and mechanically convert it into a SQL query that looks something like this:
SELECT *
FROM "Logs"
WHERE "Logs"."Id" % 2 = 0;
If you change your code to context.Logs.Where(x => x.Id % 2 == 0).Take(100), the SQL query becomes something like this:
SELECT *
FROM "Logs"
WHERE "Logs"."Id" % 2 = 0
LIMIT 100;
This is entirely because the LINQ extension methods on IQueryable use Expression instead of just Func.
Now consider context.Logs.AsEnumerable().Where(x => x.Id % 2 == 0). The IEnumerable<Log>.Where extension method is taking a Func<Log, bool> as a predicate. That is only runnable code. It cannot be analyzed to determine its structure; it cannot be used to form a SQL query.
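To make the difference concrete, here is a small sketch that assigns the same lambda once to an Expression and once to a Func. It uses a plain int parameter instead of a Log entity to stay self-contained, so the tree is slightly simpler than the one drawn above (no property access); the class and method names are made up for the example.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionVsFunc
{
    // Walks the top of the tree: the body is a comparison whose left
    // operand is itself a binary node (the modulo operation).
    public static string Describe(Expression<Func<int, bool>> predicate)
    {
        var comparison = (BinaryExpression)predicate.Body;
        var left = (BinaryExpression)comparison.Left;
        return left.NodeType + " inside " + comparison.NodeType;
    }

    public static void Main()
    {
        // Same source text, two very different runtime objects.
        Expression<Func<int, bool>> asTree = id => id % 2 == 0;
        Func<int, bool> asDelegate = id => id % 2 == 0;

        // The tree can be inspected, which is what a LINQ provider does.
        Console.WriteLine(Describe(asTree));

        // The delegate can only be invoked.
        Console.WriteLine(asDelegate(4));

        // A tree can still be turned into runnable code when needed.
        Console.WriteLine(asTree.Compile()(4));
    }
}
```

A provider that receives only the Func has nothing comparable to Describe to work with, which is why the filter then has to run in memory.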
Entity Framework and LINQ use deferred execution. It means (among other things) that they will not run the query until they need to enumerate the results: for instance using ToList() or AsEnumerable(), or if the result is used as an enumerator (in a foreach for instance).
Instead, it builds a query using predicates, and returns IQueryable objects to further "pre-filter" the results before actually returning them. You can find more information here for instance. Entity Framework will actually build a SQL query depending on the predicates you have passed it.
In your example:
context.Logs.AsEnumerable().Where(x => x.Id % 2 == 0).Take(100).ToList();
From the Logs table in the context, it fetches everything, returns an IEnumerable with the results, then filters the results, takes the first 100, and finally puts them in a List.
On the other hand, just removing the AsEnumerable solves your problem:
context.Logs.Where(x => x.Id % 2 == 0).Take(100).ToList();
Here it will build a query/filter on the result, then only once the ToList() is executed, query the database.
It also means that you can dynamically build a complex query without actually running it on the DB until the end, for instance:
var logs = context.Logs.Where(a); // first filter
if (something) {
logs = logs.Where(b); // second filter
}
var results = logs.Take(100).ToList(); // only here is the query actually executed
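A minimal, self-contained sketch of this pattern, using an in-memory list wrapped with AsQueryable() as a stand-in for context.Logs (the class name, method name and the concrete filters are made up for illustration). With a real provider, nothing would be sent to the database until ToList() runs:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ComposeSketch
{
    // Builds the query step by step; only ToList() enumerates it.
    public static List<int> Query(IQueryable<int> logIds, bool onlySmall)
    {
        IQueryable<int> query = logIds.Where(id => id % 2 == 0); // first filter

        if (onlySmall)
        {
            query = query.Where(id => id < 10); // optional second filter
        }

        return query.Take(100).ToList(); // execution happens here
    }

    public static void Main()
    {
        IQueryable<int> logIds = Enumerable.Range(1, 40).AsQueryable();
        Console.WriteLine(Query(logIds, onlySmall: true).Count);
        Console.WriteLine(Query(logIds, onlySmall: false).Count);
    }
}
```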
Update
As mentioned in your comment, you seem to already know what I just wrote, and are just asking for a reason.
It's even simpler: since AsEnumerable casts the results to another type (an IQueryable<T> to IEnumerable<T> in this case), it has to convert all the result rows first, so it has to fetch the data first. It's basically a ToList in this case.
Clearly, you understand why it's better to avoid using AsEnumerable() the way you do in your question.
Also, some of the other answers have made it very clear why calling AsEnumerable() changes the way the query is performed and read. In short, it's because you are then invoking IEnumerable<T> extension methods rather than the IQueryable<T> extension methods, the latter allowing you to combine predicates before executing the query in the database.
However, I still feel that this doesn't answer your actual question, which is a legitimate question. You said (emphasis mine):
What I mean is that this code:
context.Logs.AsEnumerable().Where(x => x.Id % 2 == 0).Take(100).ToList();
will download all the rows from the table before passing any row to the Where() method and there could be millions of rows in the table.
My question to you is: what made you conclude that this is true?
I would argue that, because you are using IEnumerable<T> instead of IQueryable<T>, it's true that the query being performed in the database will be a simple:
select * from logs
... without any predicates, unlike what would have happened if you had used IQueryable<T> to invoke Where and Take.
However, the AsEnumerable() method call does not fetch all the rows at that moment, as other answers have implied. In fact, this is the implementation of the AsEnumerable() call:
public static IEnumerable<TSource> AsEnumerable<TSource>(this IEnumerable<TSource> source)
{
return source;
}
There is no fetching going on there. In fact, even the calls to IEnumerable<T>.Where() and IEnumerable<T>.Take() don't actually start fetching any rows at that moment. They simply set up wrapping IEnumerables that will filter results as they are iterated on. The fetching and iterating of the results really only begins when ToList() is called.
So when you say:
Couldn't EF do on demand loading of rows like you can with plain ADO.NET using Read() method of SqlDataReader and save time and bandwidth?
... again, my question to you would be: doesn't it do that already?
If your table had 1,000,000 rows, I would still expect your code snippet to only fetch up to 100 rows that satisfy your Where condition, and then stop fetching rows.
To prove the point, try running the following little program:
static void Main(string[] args)
{
var list = PretendImAOneMillionRecordTable().Where(i => i < 500).Take(10).ToList();
}
private static IEnumerable<int> PretendImAOneMillionRecordTable()
{
for (int i = 0; i < 1000000; i++)
{
Console.WriteLine("fetching {0}", i);
yield return i;
}
}
... when I run it, I only get the following 10 lines of output:
fetching 0
fetching 1
fetching 2
fetching 3
fetching 4
fetching 5
fetching 6
fetching 7
fetching 8
fetching 9
It doesn't iterate through the whole set of 1,000,000 "rows" even though I am chaining Where() and Take() calls on IEnumerable<T>.
Now, keep in mind that if you test your little EF code snippet against a very small table, it may actually fetch all the rows at once, if they all fit within the value of SqlConnection.PacketSize. This is normal. SqlDataReader.Read() never fetches only a single row at a time; to reduce the number of network round trips, it always tries to fetch a batch of rows. I wonder if this is what you observed, and whether it misled you into thinking that AsEnumerable() was causing all rows to be fetched from the table.
Even though you will find that your example doesn't perform nearly as badly as you thought, this is not a reason to avoid IQueryable. Using IQueryable to construct more complex database queries will almost always provide better performance, because you can then benefit from database indexes, etc. to fetch results more efficiently.
AsEnumerable() eagerly loads the DbSet<T> Logs
You probably want something like
context.Logs.Where(x => x.Id % 2 == 0).AsEnumerable();
The idea here is that you're applying a predicate filter to the collection before actually loading it from the database.
An impressive subset of the world of LINQ is supported by EF. It will translate your beautiful LINQ queries into SQL expressions behind the scenes.
I have come across this before.
The context command is not executed until a LINQ function is called. Because you have done
context.Logs.AsEnumerable()
it has assumed you have finished with the query and therefore compiled it and returns all rows.
If you changed this to:
context.Logs.Where(x => x.Id % 2 == 0).AsEnumerable()
It would compile a SQL statement that would get only the rows where the id is divisible by 2.
Similarly if you did
context.Logs.Where(x => x.Id % 2 == 0).Take(100).ToList();
that would create a statement that would get the top 100...
I hope that helps.
LINQ to Entities has a store expression formed by all the LINQ methods called before the query is enumerated.
When you use Where() and then AsEnumerable() like this:
context.Logs.Where(...).AsEnumerable()
The Where() knows that the previous call in the chain has a store expression, so it appends its predicate to it for lazy loading.
The overload of Where that is being called is different if you call this:
context.Logs.AsEnumerable().Where(...)
Here the Where() only knows that the previous method returns an enumeration (it could be any kind of "enumerable" collection), and the only way it can apply its condition is by iterating over the collection with the IEnumerable implementation of the DbSet class, which must retrieve the records from the database first.
I don't think you should ever use this:
context.Logs.AsEnumerable().Where(x => x.Id % 2 == 0).Take(100).ToList();
The correct way of doing things would be:
context.Logs.AsQueryable().Where(x => x.Id % 2 == 0).Take(100).ToList();
Answer with explanations here:
What's the difference(s) between .ToList(), .AsEnumerable(), AsQueryable()?
Why use AsQueryable() instead of List()?
I'm using NHibernate 3.2 and I have a repository method that looks like:
public IEnumerable<MyModel> GetActiveMyModel()
{
return from m in Session.Query<MyModel>()
where m.Active == true
select m;
}
Which works as expected. However, sometimes when I use this method I want to filter it further:
var models = MyRepository.GetActiveMyModel();
var filtered = from m in models
where m.ID < 100
select new { m.Name };
Which produces the same SQL as the first one and the second filter and select must be done after the fact. I thought the whole point in LINQ is that it formed an expression tree that was unravelled when it's needed and therefore the correct SQL for the job could be created, saving my database requests.
If not, it means all of my repository methods have to return exactly what is needed and I can't make use of LINQ further down the chain without taking a penalty.
Have I got this wrong?
Updated
In response to the comment below: I omitted the line where I iterate over the results, which causes the initial SQL to be run (WHERE Active = 1) and the second filter (ID < 100) is obviously done in .NET.
Also, If I replace the second chunk of code with
var models = MyRepository.GetActiveMyModel();
var filtered = from m in models
where m.Items.Count > 0
select new { m.Name };
It generates the initial SQL to retrieve the active records and then runs a separate SQL statement for each record to find out how many Items it has, rather than writing something like I'd expect:
SELECT Name
FROM MyModel m
WHERE Active = 1
AND (SELECT COUNT(*) FROM Items WHERE MyModelID = m.ID) > 0
You are returning IEnumerable<MyModel> from the method, which will cause in-memory evaluation from that point on, even if the underlying sequence is IQueryable<MyModel>.
If you want to allow code after GetActiveMyModel to add to the SQL query, return IQueryable<MyModel> instead.
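The effect of the declared return type can be seen without NHibernate. In the sketch below an in-memory list wrapped with AsQueryable() stands in for Session.Query<MyModel>(), and all names are hypothetical. The only thing that differs between the two repository methods is the static return type, yet it decides whether a later Where binds to the Enumerable or the Queryable extension method:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReturnTypeSketch
{
    // Stand-in for Session.Query<MyModel>(): ids of "active" rows.
    private static readonly IQueryable<int> Source =
        new List<int> { 1, 50, 150, 200 }.AsQueryable();

    // Same query, exposed under two different static types.
    public static IEnumerable<int> GetAsEnumerable()
    {
        return Source.Where(id => id > 0);
    }

    public static IQueryable<int> GetAsQueryable()
    {
        return Source.Where(id => id > 0);
    }

    // Binds to Enumerable.Where: the extra filter is an in-memory iterator.
    public static bool EnumerableStaysQueryable()
    {
        IEnumerable<int> filtered = GetAsEnumerable().Where(id => id < 100);
        return filtered is IQueryable<int>;
    }

    // Binds to Queryable.Where: the extra filter is added to the expression tree.
    public static bool QueryableStaysQueryable()
    {
        IEnumerable<int> filtered = GetAsQueryable().Where(id => id < 100);
        return filtered is IQueryable<int>;
    }

    public static void Main()
    {
        Console.WriteLine(EnumerableStaysQueryable());
        Console.WriteLine(QueryableStaysQueryable());
    }
}
```

Both versions return the same rows; the difference is only where the second filter runs.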
You're running IEnumerable's extension method "Where" instead of IQueryable's. It will still evaluate lazily and give the same output, however it evaluates the IQueryable on entry and you're filtering the collection in memory instead of against the database.
When you later add an extra condition on another table (the count), it has to lazily fetch each and every one of the Items collections from the database since it has already evaluated the IQueryable before it knew about the condition.
(Yes, I would also like the extension methods on IEnumerable to be virtual members instead, but, alas, they're not.)
I am new to LINQ queries and to EF too. I usually work with MySQL, and I can't figure out how to write really simple queries.
I'd like to select all results from a table. So, I used like this:
ZXContainer db = new ZXContainer();
ViewBag.ZXproperties = db.ZXproperties.All();
But I see that I have to write something inside All(---).
Could someone guide me on how to do that? And if someone has any good links for reference too, I would be very thankful.
All() is a Boolean evaluation performed on all of the elements in a collection (though it immediately returns false when it reaches an element where the evaluation is false). For example, you may want to make sure that all of said ZXproperties have a certain field set to true:
bool isTrue = db.ZXproperties.All(z => z.SomeFieldName == true);
Which will either make isTrue true or false. LINQ is typically lazy-loading, so if you're calling db.ZXproperties directly, you have access to all of the objects as is, but it isn't quite what you're looking for. You can either load all of the objects at the variable assignment with an .ToList():
ViewBag.ZXproperties = db.ZXproperties.ToList();
or you can use the below expression:
ViewBag.ZXproperties = from s in db.ZXproperties
select s;
Which is really no different than saying:
ViewBag.ZXproperties = db.ZXproperties;
The advantage of .ToList() is that if you are wanting to do multiple calls on this ViewBag.ZXproperties, it will only require the initial database call when it is assigning the variable. Alternatively, if you do any form of queryable action on the data, such as .Where(), you'll have another query performed, which is less than ideal if you already have the data to work with.
To select everything, just skip the .All(...), as ZXproperties already is a collection.
ZXContainer db = new ZXContainer();
ViewBag.ZXproperties = db.ZXproperties;
You might want (or sometimes even need) to call .ToList() on this collection before use...
You don't use All. Just type
ViewBag.ZXproperties = db.ZXproperties;
or
ViewBag.ZXproperties = db.ZXproperties.ToList();
The All method is used to determine whether all items of a collection match some condition.
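A small sketch of what All actually returns, using plain in-memory lists rather than a DbSet (the class and method names are made up). Note that All is true for an empty sequence, since no element violates the condition:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AllSketch
{
    // True only if every value passes the check.
    public static bool AllEven(IEnumerable<int> values)
    {
        return values.All(v => v % 2 == 0);
    }

    public static void Main()
    {
        Console.WriteLine(AllEven(new List<int> { 2, 4, 6 }));
        Console.WriteLine(AllEven(new List<int> { 2, 3, 6 }));
        Console.WriteLine(AllEven(new List<int>()));
    }
}
```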
If you just want all of the items, you can just use it directly:
ViewBag.ZXproperties = db.ZXproperties;
If you want this evaluated immediately, you can convert it to a list:
ViewBag.ZXproperties = db.ZXproperties.ToList();
This will force it to be pulled across the wire immediately.
You can use this:
var result = db.ZXproperties.ToList();
For more information on LINQ, see the 101 LINQ Samples.
All performs a check on every item; the argument you pass to it is called a lambda expression.
DynamicObject LINQ query with the List compiles fine:
List<string> list = new List<string>();
var query = (from dynamic d in list where d.FirstName == "John" select d);
With our own custom class that we use for the "usual" LINQ, the compiler reports the error "An expression tree may not contain a dynamic operation":
DBclass db = new DBclass();
var query = (from dynamic d in db where d.FirstName == "John" select d);
What shall we add to handle DynamicObject LINQ?
Does DBClass implement IEnumerable? Perhaps there is a method on it you should be calling to return an IEnumerable collection?
You could add a type, against which to write the query.
I believe your problem is that in the first expression, where you are using the List<>, everything is done in memory using IEnumerable and LINQ to Objects.
Apparently, your DBClass is an IQueryable using Linq-to-SQL. IQueryables use an expression tree to build an SQL statement to send to the database.
In other words, despite looking much alike, the two statements are doing radically different things, one of which is allowed & one which isn't. (Much in the way var y = x * 5; will either succeed or fail depending on if x is an int or a string).
Further, your first example may compile, but as far as I can tell, it will fail when you run it. That's not a particularly good benchmark for success.
The only way I see this working is if the query using dynamic is made on IEnumerables using LINQ to Objects. (Load the full table into a List, and then query on the list.)
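As a sketch of that LINQ to Objects route, the example below queries a list of ExpandoObject instances, which do have a FirstName member at run time (unlike the strings in the question's list). The helper names and the sample data are made up, and how the rows would be loaded from DBclass is not shown:

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;
using System.Linq;

public static class DynamicQuerySketch
{
    // Builds a dynamic "row" with a FirstName member.
    public static object Person(string firstName)
    {
        dynamic person = new ExpandoObject();
        person.FirstName = firstName;
        return person;
    }

    // The query runs on IEnumerable, so the where clause is compiled to a
    // delegate and the dynamic member access is resolved at run time.
    public static int CountJohns(IEnumerable<object> rows)
    {
        var query = from dynamic d in rows
                    where (string)d.FirstName == "John"
                    select d;
        return query.Count();
    }

    public static void Main()
    {
        var rows = new List<object> { Person("John"), Person("Jane"), Person("John") };
        Console.WriteLine(CountJohns(rows));
    }
}
```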
Why do I get the error:
Unable to create a constant value of type 'Closure type'. Only
primitive types (for instance Int32, String and Guid) are supported in
this context.
When I try to enumerate the following Linq query?
IEnumerable<string> searchList = GetSearchList();
using (HREntities entities = new HREntities())
{
var myList = from person in entities.vSearchPeople
where upperSearchList.All( (person.FirstName + person.LastName) .Contains).ToList();
}
Update:
If I try the following just to try to isolate the problem, I get the same error:
where upperSearchList.All(arg => arg == arg)
So it looks like the problem is with the All method, right? Any suggestions?
It looks like you're trying to do the equivalent of a "WHERE...IN" condition. Check out How to write 'WHERE IN' style queries using LINQ to Entities for an example of how to do that type of query with LINQ to Entities.
Also, I think the error message is particularly unhelpful in this case because .Contains is not followed by parentheses, which causes the compiler to recognize the whole predicate as a lambda expression.
I've spent the last 6 months battling this limitation with EF 3.5 and while I'm not the smartest person in the world, I'm pretty sure I have something useful to offer on this topic.
The SQL generated by growing a 50 mile high tree of "OR style" expressions will result in a poor query execution plan. I'm dealing with a few million rows and the impact is substantial.
There is a little hack I found to do a SQL 'in' that helps if you are just looking for a bunch of entities by id:
private IEnumerable<Entity1> getByIds(IEnumerable<int> ids)
{
string idList = string.Join(",", ids.ToList().ConvertAll<string>(id => id.ToString()).ToArray());
return dbContext.Entity1.Where("it.pkIDColumn IN {" + idList + "}");
}
where pkIDColumn is your primary key id column name of your Entity1 table.
BUT KEEP READING!
This is fine, but it requires that I already have the ids of what I need to find. Sometimes I just want my expressions to reach into other relations and what I do have is criteria for those connected relations.
If I had more time I would try to represent this visually, but I don't so just study this sentence a moment: Consider a schema with a Person, GovernmentId, and GovernmentIdType tables. Andrew Tappert (Person) has two id cards (GovernmentId), one from Oregon (GovernmentIdType) and one from Washington (GovernmentIdType).
Now generate an edmx from it.
Now imagine you want to find all the people having a certain ID value, say 1234567.
This can be accomplished with a single database hit with this:
dbContext context = new dbContext();
string idValue = "1234567";
Expression<Func<Person,bool>> expr =
person => person.GovernmentID.Any(gid => gid.gi_value.Contains(idValue));
IEnumerable<Person> people = context.Person.AsQueryable().Where(expr);
Do you see the subquery here? The generated sql will use 'joins' instead of sub-queries, but the effect is the same. These days SQL server optimizes subqueries into joins under the covers anyway, but anyway...
The key to this working is the .Any inside the expression.
I have found the cause of the error (I am using Framework 4.5). The problem is that EF cannot translate a complex type that is passed as the argument to Contains into an SQL query. EF can use only simple types such as int, string, ... in a SQL query.
this.GetAll().Where(p => !assignedFunctions.Contains(p))
GetAll returns a list of objects of a complex type (for example, "Function"). The query above therefore tries to pass instances of that complex type into the SQL query, which naturally cannot work!
If I extract from my list the values that are suited to my search, I can use:
var idList = assignedFunctions.Select(f => f.FunctionId);
this.GetAll().Where(p => !idList.Contains(p.FunctionId))
Now EF no longer has to work with the complex type "Function", but with a simple type (e.g. long). And that works fine!
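The same idea as a self-contained sketch. The Function class and its members are hypothetical, and an in-memory list wrapped with AsQueryable() stands in for GetAll(), so this shows only the shape of the query; the in-memory provider would accept the complex-type version as well, unlike EF.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContainsSketch
{
    // Hypothetical complex type.
    public sealed class Function
    {
        public long FunctionId { get; set; }
        public string Name { get; set; } = "";
    }

    public static List<Function> Unassigned(
        IQueryable<Function> all, IEnumerable<Function> assigned)
    {
        // Project the complex objects down to their primitive keys first.
        List<long> assignedIds = assigned.Select(f => f.FunctionId).ToList();

        // The predicate now only involves long values.
        return all.Where(f => !assignedIds.Contains(f.FunctionId)).ToList();
    }

    public static void Main()
    {
        var all = new List<Function>
        {
            new Function { FunctionId = 1, Name = "read" },
            new Function { FunctionId = 2, Name = "write" },
            new Function { FunctionId = 3, Name = "delete" },
        };
        var assigned = new List<Function> { all[1] };

        foreach (var f in Unassigned(all.AsQueryable(), assigned))
        {
            Console.WriteLine(f.Name);
        }
    }
}
```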
I got this error message when the array object used in the .All function was null.
After I initialized the array object (upperSearchList in your case), the error went away.
The error message was misleading in this case.
where upperSearchList.All(arg => person.someproperty.StartsWith(arg))