I am aware of a few efforts to construct LINQ queries dynamically, such as this and this.
None seems ideal: I would like to avoid putting expressions in a string, and I want to omit a Where clause when it is not needed.
My main concern is that the query is optimized for the database, and dynamically omits unnecessary clauses whenever possible.
Are there any new developments in EF 4.0 for such scenarios?
UPDATE
Here is one link I found very helpful:
http://www.albahari.com/nutshell/predicatebuilder.aspx
Indeed, adding "And" filters dynamically is trivial, and adding "Or" filters can be done easily using PredicateBuilder:
var predicate = PredicateBuilder.False<Product>();
predicate = predicate.Or (p => p.Description.Contains (temp));
According to LINQPad, the emitted SQL contains only the filters that were actually applied.
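To make the "Or" case concrete, here is a minimal sketch of how such predicates can be combined with expression trees. It is modelled on the idea described on the linked page, but the class name PredicateSketch, the helper DescriptionContainsAny and the Product class are made up for illustration; this is not LinqKit's actual source. As far as I know, the Invoke nodes it produces work for LINQ to Objects after Compile(), while Entity Framework needs additional help (such as LinqKit's AsExpandable) to translate them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity used only for this illustration.
public class Product
{
    public string Description { get; set; }
}

public static class PredicateSketch
{
    // Starting point for an OR chain: matches nothing until a filter is added.
    public static Expression<Func<T, bool>> False<T>()
    {
        return item => false;
    }

    // Combines two predicates with OR by invoking the second one
    // with the parameters of the first.
    public static Expression<Func<T, bool>> Or<T>(
        this Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right)
    {
        InvocationExpression invoked =
            Expression.Invoke(right, left.Parameters.Cast<Expression>());
        return Expression.Lambda<Func<T, bool>>(
            Expression.OrElse(left.Body, invoked), left.Parameters);
    }

    // Builds "Description contains k1 OR Description contains k2 OR ...".
    public static Expression<Func<Product, bool>> DescriptionContainsAny(
        IEnumerable<string> keywords)
    {
        Expression<Func<Product, bool>> predicate = False<Product>();
        foreach (string keyword in keywords)
        {
            string temp = keyword; // local copy so each lambda captures its own value
            predicate = predicate.Or(p => p.Description.Contains(temp));
        }
        return predicate;
    }
}
```

With no keywords the predicate stays "false" and matches nothing, which is why the chain starts from False rather than True.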
For omitting the Where clause (pseudocode, I hope I understood your question correctly):
var query = IQueryable<Foo>();
if(someCondition)
query = query.Where(......);
var result = query.Select(.......);
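A runnable version of the same idea might look like the sketch below. The Foo properties, the FooSearch class and the two filter parameters are assumptions made up for the example; the point is only that each Where is appended when its condition holds, so an unused filter never becomes part of the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity; the property names are assumptions for this example.
public class Foo
{
    public string Name { get; set; }
    public int Size { get; set; }
}

public static class FooSearch
{
    // Appends a Where call only for the filters that were actually supplied.
    public static IQueryable<Foo> Build(
        IQueryable<Foo> source, string nameContains, int? minSize)
    {
        IQueryable<Foo> query = source;

        if (!string.IsNullOrEmpty(nameContains))
            query = query.Where(f => f.Name.Contains(nameContains));

        if (minSize.HasValue)
            query = query.Where(f => f.Size >= minSize.Value);

        // Nothing has executed yet; the provider sees only the clauses added above.
        return query;
    }
}
```

Because the query is still an IQueryable<Foo> when it is returned, a provider such as Entity Framework only translates the clauses that were added.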
For dynamic queries - I haven't heard about anything new. IMHO we will have to stay with strings. Can you come up with some better approach?
Can anyone please help me with a general Entity Framework question? I'm a newbie and trying to teach myself from reading and trial & error. However, I'm getting REALLY confused on all the syntax and terminology. And the more I google, the more confused I get!
What in the world are those little arrows (=>) used in the syntax? And I'm not even sure what the name of the syntax is...is it Entity Framework syntax? Linq to method syntax? Linq to Entity syntax?
Why does it seem like you can use random letters when using that syntax? The "f" below seems interchangeable with any letter of the alphabet, since IntelliSense gives me options no matter what letter I type. So what is that letter supposed to stand for anyway? There seems to be no declaration for it.
var query = fruits.SelectMany(f => f.Split(' '));
Is it better to use the syntax with the little arrows or to use the "pseudo SQL" that I keep seeing, like below? This seems a little easier to understand, but is this considered not the Real Entity Framework Way?
var query = from f in fruits from word in f.Split(' ') select word;
And, for any of them - is there any documentation out there ANYWHERE?? I've been scouring the internet for tutorials, articles, anything, but all that comes back are small sample queries using either the little arrows or that pseudo SQL, with no explanations beyond "here's how to do a select:"
I would much appreciate any guidance or assistance. I think if I can just find out where to start, then I can build myself from there. Thanks!
There is no "real Entity Framework way". There is LINQ query syntax and there are the LINQ extension methods, which in my opinion are much easier on the eyes. Also, you can use LINQ with more than just EF.
Language Integrated Query
LINQ extends the language by the addition of query expressions, which are akin to SQL statements, and can be used to conveniently extract and process data from arrays, enumerable classes, XML documents, relational databases, and third-party data sources. Other uses, which utilize query expressions as a general framework for readably composing arbitrary computations, include the construction of event handlers or monadic parsers.
1. It is called a lambda expression, and it is basically an anonymous method.
Exploring Lambda Expression in C#
2. You can use anything you want, a word or a letter, as long as it is a valid parameter name, because that is exactly what it is: a parameter.
3. I find the LINQ extension methods to be cleaner, and to be honest the last thing I want to see is SQL-like statements lying around in the code.
4. A good start can be found here:
101 LINQ SAMPLES
The arrow is called a Lambda operator, and it's used to create Lambda expressions. This has nothing to do with EF, or Linq or anything else. It's a feature of C#. EF and Linq just use this feature a lot because it's very useful for writing queries.
Marco has given links to the relevant documentation.
LINQ is a library of extension methods that primarily operate on types implementing the IEnumerable and IQueryable interfaces, and it gives you a lot of power to work with collections of various types. You can write LINQ queries in two formats, so-called method syntax and query syntax. They are functionally identical, and which one you use is generally a matter of personal preference (although many of us use both, because depending on the context one or the other is easier to use).
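To illustrate the points above, here is a small sketch using the fruits example from the question. The class and method names are made up for the example. It shows the same query in method syntax, in method syntax with a differently named lambda parameter, and in query syntax; all three produce the same words.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SyntaxComparison
{
    // Method syntax: "f" is just the name of the lambda's parameter.
    public static List<string> WithMethodSyntax(IEnumerable<string> fruits)
    {
        return fruits.SelectMany(f => f.Split(' ')).ToList();
    }

    // Same query; the parameter name is arbitrary, like any method parameter.
    public static List<string> WithRenamedParameter(IEnumerable<string> fruits)
    {
        return fruits.SelectMany(anyValidName => anyValidName.Split(' ')).ToList();
    }

    // Query syntax: the compiler rewrites this into a SelectMany call.
    public static List<string> WithQuerySyntax(IEnumerable<string> fruits)
    {
        return (from f in fruits
                from word in f.Split(' ')
                select word).ToList();
    }
}
```

Since the fruits here are plain strings in memory, this is LINQ to Objects; nothing in it is specific to Entity Framework.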
I'm looking for confirmation/clarification with these LINQ expressions:
var context = new SomeCustomDbContext();
// LINQ to Entities?
var items = context.CustomItems.OrderBy(i => i.Property).ToList();
// LINQ to Objects?
var items2 = context.CustomItems.ToList().OrderBy(i => i.Property);
Am I correct in thinking the first method is LINQ to Entities where EF builds a more specific SQL statement to pass on, putting the ordering effort on the database?
Is the second method LINQ to Objects where LINQ drags the whole collection into memory (the ToList() enumeration?) before ordering thus leaving the burden on the server side (the web server in this case)?
If this is the case, I can quickly see situations where L2E would be advantageous (ex. filtering/trimming collections before pulling them into memory).
But are there any other details/trade-offs I should be aware of, or times when "method 2" might be advantageous over the first method?
UPDATE:
Let's say we are not using Entity Framework. This is still true so long as the underlying repository/data source implements IQueryable<T>, right? And if it doesn't, both of these statements result in LINQ to Objects operations in memory?
Yes.
Yes.
Yes.
You are correct that calling ToList() forces linq-to-entities to evaluate and return the results as a list. As you suspect, this can have huge performance implications.
There are cases where linq-to-entities cannot figure out how to parse what looks like a perfectly simple query (like Where(x => SomeFunction(x))). In these cases you often have no choice but to call ToList() and operate on the collection in memory.
In response to your update:
ToList() always forces everything ahead of it to evaluate immediately, as opposed to deferred execution. Take this example:
someEnumerable.Take(10).ToList();
vs
someEnumerable.ToList().Take(10);
In the second example, any deferred work on someEnumerable must be executed before taking the first 10 elements. If someEnumerable is doing something labor intensive (like reading files from the disk using Directory.EnumerateFiles()), this could have very real performance implications.
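The difference is easy to observe with a source that counts how many elements it has produced. The sketch below uses a made-up iterator, Numbers, in place of something expensive such as reading files; the names are assumptions for the example only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Number of elements the source has actually been asked to produce.
    public static int Produced;

    // Lazy source: nothing runs until someone enumerates it.
    public static IEnumerable<int> Numbers(int count)
    {
        for (int i = 0; i < count; i++)
        {
            Produced++; // stands in for expensive per-element work
            yield return i;
        }
    }

    // Take first, then materialize: the source stops after 10 elements.
    public static int ProducedByTakeThenToList()
    {
        Produced = 0;
        Numbers(1000).Take(10).ToList();
        return Produced;
    }

    // Materialize first, then take: all 1000 elements are produced.
    public static int ProducedByToListThenTake()
    {
        Produced = 0;
        Numbers(1000).ToList().Take(10).ToList();
        return Produced;
    }
}
```

Both variants end up with the same ten numbers; only the amount of work done by the source differs.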
Am I correct in thinking the first method is LINQ to Entities where EF builds a more specific SQL statement to pass on, putting the ordering effort on the database?
Yes
Is the second method LINQ to Objects where LINQ drags the whole collection into memory ... before ordering thus leaving the burden on the server side ...?
Yes
But are there any other details/trade-offs I should be aware of, or times when "method 2" might be advantageous over the first method?
There will be many times where Method 1 is not possible - usually when you have a complex filter or sort order that can't be directly translated to SQL (or more appropriately where EF does not support a direct SQL translation). Also since you can't transmit lazy-loaded IQueryables over-the-wire, any time you have to serialize a result you're going to have to materialize it first with ToList() or something similar.
The other thing to be aware of is that IQueryable makes no guarantees on either (a) the semantic reasoning of the underlying provider, or (b) how much of the set of IQueryable methods are implemented by the provider.
For example:
EF does not support Last().
Nor can it translate time-part comparisons of DateTime values into valid T-SQL.
It doesn't support FirstOrDefault() in subqueries.
In such circumstances you need to bring data back to the client and then perform further evaluation client-side.
You also need to have an understanding of "how" it parses the LINQ pipeline to generate (in the case of EF) T-SQL. So you sometimes have to think carefully about how you construct your LINQ queries in order to generate effective T-SQL.
Having said all that, IQueryable<> is an extremely powerful tool in the .NET framework and well worth getting more familiar with.
Currently I am using LinqKit and the Microsoft Dynamic LINQ query sample to dynamically build LINQ expressions from strings. This works fine.
LinqKit: http://www.albahari.com/nutshell/linqkit.aspx
Microsoft dynamic Linq queries: http://weblogs.asp.net/scottgu/archive/2008/01/07/dynamic-linq-part-1-using-the-linq-dynamic-query-library.aspx
Right now, I am migrating my application from .NET 3.5 to .NET 4.0. I am wondering if there is another way (a standard way within the framework) to build queries from strings.
I have checked the documentation, but have not found anything yet. This is not a pressing issue, since I have the above solution.
I'd just prefer to use the "standard" features if there are any. What's the best practice?
I'm currently doing something like this and I'm very happy with the result. I did it with Entity Framework and the ObjectQuery<T>.Select(string projection, params ObjectParameter[] parameters) method. More info here: http://msdn.microsoft.com/en-us/library/bb298787.aspx#Y586.
You won't be building an expression from a string; instead you use Entity SQL, which does the job very well and was made exactly for this purpose. Building expressions dynamically isn't trivial and is actually slower.
Cheers
Now that LINQ is such an integral part of .NET, are there optimizations at the compiler level that would use the optimal path to get results?
For example, imagine you had an array of integers and wanted to get the lowest value. You could do this without LINQ using a foreach, but it's certainly easier to use the Min function in LINQ. Once this goes to the compiler with LINQ would you have been better off to skip LINQ altogether or does it essentially just convert it to something similar to a foreach?
The C# compiler doesn't do much at all - it just calls the methods you tell it to, basically.
You could argue that removing unnecessary Select calls is an optimization:
from x in collection
where x.Condition
select x
is compiled to collection.Where(x => x.Condition) instead of collection.Where(x => x.Condition).Select(x => x) as the compiler recognises the identity transformation as being redundant. (A degenerate query of the form from x in collection select x is immune to this optimization, however, to allow LINQ providers to make sure that any query goes through at least one of their methods.)
The LINQ to Objects Min method just does a foreach, yes. Various LINQ to Objects methods do perform optimization. For example, Count() will check whether the data source implements ICollection or ICollection<T> and use the Count property if so. As madgnome points out in a comment, I wrote a bit more about this in a blog post a while ago.
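The Count() shortcut can be observed with a collection that records how often it is enumerated. The CountingCollection class below is made up for this illustration; it is a read-only wrapper that implements ICollection<int> so that LINQ to Objects can read its Count property instead of iterating.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical read-only collection that counts how often it is enumerated.
public sealed class CountingCollection : ICollection<int>
{
    private readonly List<int> _items;

    public CountingCollection(IEnumerable<int> items)
    {
        _items = new List<int>(items);
    }

    // Incremented every time someone asks for an enumerator.
    public int Enumerations { get; private set; }

    public int Count { get { return _items.Count; } }
    public bool IsReadOnly { get { return true; } }

    public IEnumerator<int> GetEnumerator()
    {
        Enumerations++;
        return _items.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    public bool Contains(int item) { return _items.Contains(item); }
    public void CopyTo(int[] array, int arrayIndex) { _items.CopyTo(array, arrayIndex); }

    // Mutation is not needed for the demonstration.
    public void Add(int item) { throw new NotSupportedException(); }
    public void Clear() { throw new NotSupportedException(); }
    public bool Remove(int item) { throw new NotSupportedException(); }
}
```

Calling the LINQ Count() extension method on an instance leaves Enumerations at zero, whereas Min() has to walk the elements and therefore requests an enumerator.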
Of course, other LINQ providers can perform their own optimizations.
I am currently using a CMS which uses an ORM with its own bespoke query language (i.e. with select/where/orderby like statements). I refer to this mini-language as a DSL, but I might have the terminology wrong.
We are writing controls for this CMS, but I would prefer not to couple the controls to the CMS, because we have some doubts about whether we want to continue with this CMS in the longer term.
We can decouple our controls from the CMS fairly easily, by using our own DAL/abstraction layer or what not.
Then I remembered that on most of the CMS controls, they provide a property (which is design-time editable) where users can type in a query to control what gets populated in the data source. Nice feature - the question is how can I abstract this feature?
It then occurred to me that maybe a DSL framework existed out there that could provide me with a simple query language that could be turned into a LINQ expression at runtime. Thus decoupling me from the CMS' query DSL.
Does such a thing exist? Am I wasting my time? (probably the latter)
Thanks
This isn't going to answer your question fully, but there is an extension for LINQ called Dynamic LINQ that allows you to specify predicates for LINQ queries as strings. So if you want to store the conditions in some string-based format, you could probably build your language on top of this. You'd still need to find a way to represent the different clauses (where/orderby/etc.), but for the predicates passed as arguments to these, you could use Dynamic LINQ.
Note that Dynamic LINQ allows you to parse the string, but AFAIK it doesn't have any way to turn an existing expression tree back into such a string... so there would be some work needed to do that.
(But I'm not sure if I fully understand the question, so maybe I'm totally off :-))
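To give a feel for what "a simple query language turned into a LINQ expression at runtime" involves, here is a very small hand-rolled sketch. It is not Dynamic LINQ and is far less capable; the TinyFilter class, the "Property operator value" format and the Item class are all assumptions invented for the example.

```csharp
using System;
using System.Globalization;
using System.Linq.Expressions;

// Hypothetical item type used only to try the parser out.
public class Item
{
    public string Name { get; set; }
    public int Stock { get; set; }
}

public static class TinyFilter
{
    // Parses text of the form "<Property> <operator> <value>", e.g. "Stock > 10",
    // into a predicate expression over T. Only one comparison is supported.
    public static Expression<Func<T, bool>> Parse<T>(string text)
    {
        string[] parts = text.Split(
            new[] { ' ' }, 3, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length != 3)
            throw new FormatException("Expected: <Property> <operator> <value>");

        ParameterExpression item = Expression.Parameter(typeof(T), "item");
        MemberExpression member = Expression.Property(item, parts[0]);

        // Convert the literal to the property's type (int, string, ...).
        object value = Convert.ChangeType(
            parts[2], member.Type, CultureInfo.InvariantCulture);
        ConstantExpression constant = Expression.Constant(value, member.Type);

        Expression body;
        switch (parts[1])
        {
            case "==": body = Expression.Equal(member, constant); break;
            case "!=": body = Expression.NotEqual(member, constant); break;
            case ">":  body = Expression.GreaterThan(member, constant); break;
            case ">=": body = Expression.GreaterThanOrEqual(member, constant); break;
            case "<":  body = Expression.LessThan(member, constant); break;
            case "<=": body = Expression.LessThanOrEqual(member, constant); break;
            default:
                throw new NotSupportedException("Unknown operator: " + parts[1]);
        }

        return Expression.Lambda<Func<T, bool>>(body, item);
    }
}
```

The result can be passed to Queryable.Where, or compiled with Compile() for in-memory use. A real design would also need And/Or, ordering and proper error reporting, which is where an existing library becomes attractive.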