LINQ syntax - ordering of criteria

I'm trying to understand LINQ syntax and getting stuck. So I've got this line which gets all of the people with the postcode I'm searching for
IQueryable<int> PersonIDsWithThisPostcode = _context.Addresses
.Where(pst => pst.Postcode.Contains(p))
.Select(b => b.PersonID);
This line then only returns people in PersonIDsWithThisPostcode
persons = persons.Where(ps => PersonIDsWithThisPostcode.Contains(ps.PersonID));
I'd have expected it to be something along the lines of this, where you're looking at a container, then checking against a subset of values to see what you want.
persons = persons.Where(ps => ps.PersonID.Contains(PersonIDsWithThisPostcode));
So from a SQL point-of-view I'd think of it something like this
bucket = bucket.Where(bucket.Contains(listoffish));
but it seems to act like this
bucket = bucket.Where(listoffish.Contains(bucket));
I've read through lots of documentation but I can't get my head around this apparently simple notion. Any help to explain this way of thinking would be appreciated.
Thanks

If PersonID is an int, you can't use ps.PersonID.Contains, because an int is not a collection (nor a string, where Contains would search for a substring).
The only correct way is to look up each PersonID in a collection, here the PersonIDsWithThisPostcode query, which returns all matching PersonIDs.
A single PersonID doesn't contain a collection, but a collection of PersonIDs can contain a single PersonID.
So this is correct; it returns all persons whose PersonID is in the other sequence:
persons = persons.Where(ps => PersonIDsWithThisPostcode.Contains(ps.PersonID));
and this is not:
persons = persons.Where(ps => ps.PersonID.Contains(PersonIDsWithThisPostcode));

The syntax is reversed in comparison to SQL, which should come as no surprise, considering that C# and SQL are two different languages.
In SQL you place the list on the right, because the IN operator reads "item in collection":
WHERE someId IN (100, 102, 113, 200, 219)
In C#, without regard to LINQ, you check if a collection contains an item using code that reads "collection contains item"
myList.Contains(someId);
When you use Contains in a LINQ query that gets translated to SQL, the LINQ provider translates one syntax to the other, so C# programmers don't have to think about the difference.
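To make the "collection contains item" reading concrete, here is a minimal in-memory sketch. The Person class, the sample data and the NamesForIds helper are made up for illustration; with a real LINQ provider the same Where/Contains shape is what gets translated to an IN clause.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContainsDirectionDemo
{
    // Hypothetical stand-in for the Person entity; only PersonID matters here.
    public class Person
    {
        public int PersonID { get; set; }
        public string Name { get; set; }
    }

    // Reads as "ids contains ps.PersonID", the mirror image of SQL's "PersonID IN (ids)".
    public static List<string> NamesForIds(IEnumerable<Person> persons, ICollection<int> ids)
    {
        return persons
            .Where(ps => ids.Contains(ps.PersonID))
            .Select(ps => ps.Name)
            .ToList();
    }

    public static void Main()
    {
        var persons = new List<Person>
        {
            new Person { PersonID = 1, Name = "Ann" },
            new Person { PersonID = 2, Name = "Bob" },
            new Person { PersonID = 3, Name = "Cy" }
        };
        var personIdsWithThisPostcode = new List<int> { 1, 3 };

        // prints: Ann, Cy
        Console.WriteLine(string.Join(", ", NamesForIds(persons, personIdsWithThisPostcode)));
    }
}
```

The item being tested (ps.PersonID) is always the argument; the set you test against is always the receiver of Contains.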

Related

Get all the includes from an Entity Framework Query?

I've the following Entity Model : Employee has a Company and a Company has Employees.
When using the Include statement like below:
var query = context.Employees.Include(e => e.Company);
query.Dump();
All related data is retrieved from the database correctly. (Using LEFT OUTER JOIN on Company table)
The problem is that when I use GroupBy() from System.Linq.Dynamic to group by Company.Name, the Employees are missing the Company data because the Include is lost.
Example:
var groupByQuery = query.GroupBy("new (Company.Name as CompanyName)", "it");
groupByQuery.Dump();
Is there a way to easily retrieve the applied Includes on the 'query' as a string collection, so that I can include them in the dynamic GroupBy like this:
var groupByQuery2 = query.GroupBy("new (Company, Company.Name as CompanyName)", "it");
groupByQuery2.Dump();
I thought about using the ToString() functionality to get the SQL Command like this:
string sql = query.ToString();
And then use RegEx to extract all LEFT OUTER JOINs, but there is probably a better solution?
If you're creating the query in the first place, I'd always opt to save the includes (and add to them if you're making a composite query/filtering).
e.g. instead of returning just 'query' return new QueryContext {Query = query, Includes = ...}
I'd like to see a more elegant solution - but I think that's your best bet.
Otherwise you're looking at expression trees, visitors and all those nice things.
SQL parsing isn't that straight either - as queries are not always that simple (often a combo of things etc.).
e.g. there is a 'span' inside the query object (if you traverse a bit) which seems to be holding the 'Includes' but it's not much help.
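A rough sketch of the "save the includes alongside the query" idea might look like this. The answer only names QueryContext with Query and Includes; the generic parameter, the constructor and the GroupBySelector helper are assumptions added for illustration, and the class only records include paths (calling Include on the query is still up to the caller).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper: carries a query together with the include paths applied to it.
public sealed class QueryContext<T>
{
    public QueryContext(IQueryable<T> query, IEnumerable<string> includes)
    {
        Query = query;
        Includes = includes.ToList();
    }

    public IQueryable<T> Query { get; }
    public IReadOnlyList<string> Includes { get; }

    // Builds the selector string for a dynamic GroupBy, listing every recorded include first.
    public string GroupBySelector(string keyExpression)
    {
        return "new (" + string.Join(", ", Includes.Concat(new[] { keyExpression })) + ")";
    }
}

public static class QueryContextDemo
{
    public static void Main()
    {
        // Placeholder source; a real caller would pass context.Employees.Include(...).
        var employees = new List<string>().AsQueryable();
        var ctx = new QueryContext<string>(employees, new[] { "Company" });

        // prints: new (Company, Company.Name as CompanyName)
        Console.WriteLine(ctx.GroupBySelector("Company.Name as CompanyName"));
    }
}
```

The resulting string has the same shape as the selector passed to the second GroupBy call in the question.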

LINQ Query to find all tags?

I have an application that manages documents called Notes. Like a blog, Notes can be searched for matches against one or more Tags, which are contained in a Note.Tags collection property. A Tag has Name and ID properties, and matches are made against the ID. A user can specify multiple tags to match against, in which case a Note must contain all Tags specified to match.
I have a very complex LINQ query to perform a Note search, with extension methods and looping. Quite frankly, it has a real code smell to it. I want to rewrite the query with something much simpler. I know that if I made the Tag a simple string, I could use something like this:
var matchingNotes = from n in myNotes
where n.Tags.All(tag => searchTags.Contains(tag))
select n;
Can I do something that simple if my model uses a Tag object with an ID? What would the query look like? Could it be written in fluent syntax, and what would that look like?
I believe you can find notes that have the relevant tags in a single LINQ expression:
IQueryable<Note> query = ... // top part of query
query = query.Where(note => searchTags.All(st =>
note.Tags.Any(notetag => notetag.Id == st.Id)));
Unfortunately, query syntax has no keywords for All and Any, so the best you can do there is
query = from note in query
where searchTags.All(st =>
note.Tags.Any(notetag => notetag.Id == st.Id))
select note;
which is not that much better either.
For starters see my comment; I suspect the query is wrong anyway! I would simplify it by enforcing separately that each tag exists:
IQueryable<Note> query = ... // top part of query
foreach(var tagId in searchTagIds) {
var tmpId = tagId; // modified closures...
query = query.Where(note => note.Tags.Any(t => t.Id == tmpId));
}
This should have the net effect of enforcing all the tags specified are present and accounted for.
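Here is a small self-contained sketch of that loop, run against an in-memory list through AsQueryable so it can be tried without a database. The Tag and Note classes, the sample data and the WithAllTags helper name are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagFilterDemo
{
    // Hypothetical minimal model: a Note carries a collection of Tags matched by Id.
    public class Tag { public int Id { get; set; } }
    public class Note
    {
        public string Title { get; set; }
        public List<Tag> Tags { get; set; }
    }

    // Stacks one Where per requested tag, so a note must carry every tag to survive.
    public static IQueryable<Note> WithAllTags(IQueryable<Note> query, IEnumerable<int> searchTagIds)
    {
        foreach (var tagId in searchTagIds)
        {
            var tmpId = tagId; // local copy, so each lambda captures its own value
            query = query.Where(note => note.Tags.Any(t => t.Id == tmpId));
        }
        return query;
    }

    public static Note Make(string title, params int[] tagIds)
    {
        return new Note { Title = title, Tags = tagIds.Select(id => new Tag { Id = id }).ToList() };
    }

    public static void Main()
    {
        var notes = new List<Note> { Make("A", 1, 2), Make("B", 1), Make("C", 1, 2, 3) }.AsQueryable();
        var titles = WithAllTags(notes, new[] { 1, 2 }).Select(n => n.Title);

        // prints: A, C
        Console.WriteLine(string.Join(", ", titles));
    }
}
```

Each pass through the loop narrows the query further, which is why the combined effect is "all tags present" rather than "any tag present".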
Timwi's solution works in most dialects of LINQ, but not in Linq to Entities. I did find a single-statement LINQ query that works, courtesy of ReSharper. Basically, I wrote a foreach block to do the search, and ReSharper offered to convert the block to a LINQ statement--I had no idea it could do this.
I let ReSharper perform the conversion, and here is what it gave me:
return searchTags.Aggregate<Tag, IQueryable<Note>>(DataStore.ObjectContext.Notes, (current, tag) => current.Where(n => n.Tags.Any(t => t.Id == tag.Id)).OrderBy(n => n.Title));
I read my Notes collection from a database, using Entity Framework 4. DataStore is the custom class I use to manage my EF4 connection; it holds the EF4 ObjectContext as a property.

Linq Expression Syntax - How to make it more readable?

I am in the process of writing something that will use Linq to combine results from my database, via Linq2Sql and an in-memory list of objects in order to find out which of my in-memory objects match something on the database.
I've come up with the query in both expression and query syntax.
Expression Syntax
var query = order.Items.Join(productNonCriticalityList,
i => i.ProductID,
p => p.ProductID,
(i, p) => i);
Query Syntax
var query =
from p in productNonCriticalityList
join i in order.Items
on p.ProductID equals i.ProductID
select i;
I realise that we have all the code completion goodness with expression syntax, and I do actually use that more. Mainly because it's easier to create re-usable chunks of filter code that can be chained together to form more complex filters.
But for a join the latter seems far more readable to me, but maybe that is because I am used to writing T-SQL.
So, am I missing a trick or is it just a matter of getting used to it?
I agree with the other responders that the exact question you're asking is simply a matter of preference. Personally, I mix the two forms depending upon which is clearer for the specific query that I'm writing.
If I have one comment though, I would say that the query looks like it might load all of the items from the order. That might be fine for a single order one time, but if you're looping through lots of orders, it might be more efficient to load all of the items for all of the orders in one go (you might want to additionally filter by date or customer, or whatever, though). If you do that, you might get better results by switching the query around:
var productIds = (from p in productNonCriticalityList
orderby p.ProductID
select p.ProductID).Distinct();
var orderItems = from i in dc.OrderItems
where productIds.Contains(i.ProductID)
// Additional filtering can be added here with &&.
select i;
It's a bit backwards at first glance, but it could save you from loading in all the order items and also from sending lots of queries. It works because the where productIds.Contains(...) call can be converted to where i.ProductID in (1, 2, 3, 4, 5) in SQL. Of course, you'd have to judge it based on the expected number of order items, and the number of product IDs.
It really all comes down to preference. Some people just hate the idea of query-like syntax in their code. I for one appreciate the query syntax; it is declarative and quite readable. Like you said though, the chainability of the first example is a nice thing to have. I guess for my money I would keep it in query syntax until I felt I needed to begin chaining calls.
I used to feel the same way. Now I find query syntax easier to read and write, particularly when things get complicated. As much as it irked me to type it the first time, 'let' does wonderful things in ways that would not be readable in Expression Syntax.
I prefer the Query syntax when it's complex and Expression syntax when it's a simple query.
If a DBA were to read the C# code to see what SQL we are using, they would understand and digest the Query syntax more easily.
Taking a simple example:
Query
var col = from o in orders
orderby o.Cost ascending
select o;
Expression
var col2 = orders.OrderBy(o => o.Cost);
To me, the Expression syntax is an easier choice to understand here.
Another example:
Query
var col9 = from o in orders
orderby o.CustomerID, o.Cost descending
select o;
Expression
var col6 = orders.OrderBy(o => o.CustomerID).
ThenByDescending(o => o.Cost);
Both are easy to read and understand, however if the query was
//returns same results as above
var col5 = from o in orders
orderby o.Cost descending
orderby o.CustomerID
select o;
//NOTE the ordering of the orderby's
That looks a little confusing to me, as the fields are in a different order and it appears a little backwards.
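To see why the stacked orderby version gives the same result, here is a small LINQ to Objects sketch. The Order class and sample data are made up. It relies on Enumerable.OrderBy being a stable sort, so the later orderby on CustomerID keeps the earlier Cost ordering within each customer; a database provider is not guaranteed to behave the same way, so treat this as an in-memory illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderByDemo
{
    // Hypothetical minimal order type for illustration.
    public class Order
    {
        public int CustomerID { get; set; }
        public int Cost { get; set; }
    }

    public static List<Order> Sample()
    {
        return new List<Order>
        {
            new Order { CustomerID = 2, Cost = 5 },
            new Order { CustomerID = 1, Cost = 10 },
            new Order { CustomerID = 1, Cost = 20 },
            new Order { CustomerID = 2, Cost = 15 }
        };
    }

    // One orderby clause with two keys: compiles to OrderBy(...).ThenByDescending(...).
    public static IEnumerable<Order> SingleClause(IEnumerable<Order> orders)
    {
        return from o in orders
               orderby o.CustomerID, o.Cost descending
               select o;
    }

    // Two orderby clauses: compiles to OrderByDescending(...).OrderBy(...).
    public static IEnumerable<Order> StackedClauses(IEnumerable<Order> orders)
    {
        return from o in orders
               orderby o.Cost descending
               orderby o.CustomerID
               select o;
    }

    public static string Show(IEnumerable<Order> orders)
    {
        return string.Join(" ", orders.Select(o => o.CustomerID + ":" + o.Cost));
    }

    public static void Main()
    {
        Console.WriteLine(Show(SingleClause(Sample())));   // prints: 1:20 1:10 2:15 2:5
        Console.WriteLine(Show(StackedClauses(Sample()))); // prints: 1:20 1:10 2:15 2:5
    }
}
```

The single-clause form states the intent directly, which is one more reason to prefer it over stacked orderby clauses.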
For Joins
Query
var col = from c in customers
join o in orders on
c.CustomerID equals o.CustomerID
select new
{
c.CustomerID,
c.Name,
o.OrderID,
o.Cost
};
Expression:
var col2 = customers.Join(orders,
c => c.CustomerID,o => o.CustomerID,
(c, o) => new
{
c.CustomerID,
c.Name,
o.OrderID,
o.Cost
}
);
I find that Query is better.
My summary would be use whatever looks easiest and fastest to understand given the query at hand. There is no golden rule of which to use. However, if there are a lot of joins, I'd go with Query syntax.
Well, both statements are equivalent. So you could use them both, depending on the surrounding code and what is more readable. In my project I decide which syntax to use based on those two conditions.
Personally I would write the expression syntax in one line, but this is a matter of taste.

LINQ syntax where string value is not null or empty

I'm trying to do a query like so...
query.Where(x => !string.IsNullOrEmpty(x.PropertyName));
but it fails...
so for now I have implemented the following, which works...
query.Where(x => (x.PropertyName ?? string.Empty) != string.Empty);
is there a better (more native?) way that LINQ handles this?
EDIT
Apologies! I didn't include the provider... This is using LINQ to SQL.
http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=367077
Problem Statement
It's possible to write LINQ to SQL that gets all rows that have either null or an empty string in a given field, but it's not possible to use string.IsNullOrEmpty to do it, even though many other string methods map to LINQ to SQL.
Proposed Solution
Allow string.IsNullOrEmpty in a LINQ to SQL where clause so that these two queries have the same result:
var fieldNullOrEmpty =
from item in db.SomeTable
where item.SomeField == null || item.SomeField.Equals(string.Empty)
select item;
var fieldNullOrEmpty2 =
from item in db.SomeTable
where string.IsNullOrEmpty(item.SomeField)
select item;
Other Reading:
1. DevArt
2. Dervalp.com
3. StackOverflow Post
This won't fail on Linq2Objects, but it will fail for Linq2SQL, so I am assuming that you are talking about the SQL provider or something similar.
The reason has to do with the way that the SQL provider handles your lambda expression. It doesn't take it as a function Func<P,T>, but as an expression Expression<Func<P,T>>. It takes that expression tree and translates it to an actual SQL statement, which it sends off to the server.
The translator knows how to handle basic operators and the methods it has explicit mappings for, but not arbitrary methods. It doesn't know that IsNullOrEmpty(x) is equivalent to x == null || x == string.Empty. That has to be spelled out explicitly for the translation to SQL to take place.
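The difference is easy to see by inspecting the two expression trees directly. This is only a sketch of what any translator is handed, not of LINQ to SQL internals; the BodyKind helper is a made-up name.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // Returns the node type at the root of the lambda body.
    public static ExpressionType BodyKind(Expression<Func<string, bool>> predicate)
    {
        return predicate.Body.NodeType;
    }

    public static void Main()
    {
        // The method version is a single opaque method-call node.
        Console.WriteLine(BodyKind(x => string.IsNullOrEmpty(x)));            // prints: Call

        // The operator version is an OR node over two comparisons.
        Console.WriteLine(BodyKind(x => x == null || x == string.Empty));     // prints: OrElse

        // Compiled to a delegate (as LINQ to Objects would run it), both behave the same.
        Expression<Func<string, bool>> viaMethod = x => string.IsNullOrEmpty(x);
        Console.WriteLine(viaMethod.Compile()(""));                           // prints: True
    }
}
```

A provider that meets the Call node has to recognise that particular method to emit SQL for it, while the OrElse tree is built from operators it already understands.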
This will work fine with Linq to Objects. However, some LINQ providers have difficulty running CLR methods as part of the query. This is especially true of some database providers.
The problem is that the DB providers try to move and compile the LINQ query as a database query, to prevent pulling all of the objects across the wire. This is a good thing, but does occasionally restrict the flexibility in your predicates.
Unfortunately, without checking the provider documentation, it's difficult to always know exactly what will or will not be supported directly in the provider. It looks like your provider allows comparisons, but not the string check. I'd guess that, in your case, this is probably about as good an approach as you can get. (It's really not that different from the IsNullOrEmpty check, other than the explicit comparison against string.Empty, but that's minor.)
... 12 years ago :) But still, someone may find it helpful:
Often it is good to check white spaces too
query.Where(x => !string.IsNullOrWhiteSpace(x.PropertyName));
it will be converted to SQL as:
WHERE [x].[PropertyName] IS NOT NULL AND ((LTRIM(RTRIM([x].[PropertyName])) <> N'') OR [x].[PropertyName] IS NULL)
or other way:
query.Where(x => string.Compare(x.PropertyName," ") > 0);
will be converted to sql as:
WHERE [x].[PropertyName] > N' '
If you want to change the type of the collection from the nullable type IEnumerable<T?> to the non-null type IEnumerable<T>, you can use .OfType<T>().
.OfType<T>() will remove null values and return a sequence of type T.
Example: If you have a list of nullable strings, List<string?>, you can change the type of the list to string by using OfType<string>() as in the example below:
List<string?> nullableStrings = new List<string?> { "test1", null, "test2" };
List<string> strings = nullableStrings.OfType<string>().ToList();
// strings now only contains { "test1", "test2" }
This will result in a list of strings only containing test1 and test2.

Entity Framework - "Unable to create a constant value of type 'Closure type'..." error

Why do I get the error:
Unable to create a constant value of type 'Closure type'. Only
primitive types (for instance Int32, String and Guid) are supported in
this context.
When I try to enumerate the following Linq query?
IEnumerable<string> upperSearchList = GetSearchList();
using (HREntities entities = new HREntities())
{
var myList = (from person in entities.vSearchPeople
where upperSearchList.All( (person.FirstName + person.LastName) .Contains)
select person).ToList();
}
Update:
If I try the following just to try to isolate the problem, I get the same error:
where upperSearchList.All(arg => arg == arg)
So it looks like the problem is with the All method, right? Any suggestions?
It looks like you're trying to do the equivalent of a "WHERE...IN" condition. Check out How to write 'WHERE IN' style queries using LINQ to Entities for an example of how to do that type of query with LINQ to Entities.
Also, I think the error message is particularly unhelpful in this case because .Contains is not followed by parentheses, which causes the compiler to recognize the whole predicate as a lambda expression.
I've spent the last 6 months battling this limitation with EF 3.5 and while I'm not the smartest person in the world, I'm pretty sure I have something useful to offer on this topic.
The SQL generated by growing a 50 mile high tree of "OR style" expressions will result in a poor query execution plan. I'm dealing with a few million rows and the impact is substantial.
There is a little hack I found to do a SQL 'in' that helps if you are just looking for a bunch of entities by id:
private IEnumerable<Entity1> getByIds(IEnumerable<int> ids)
{
string idList = string.Join(",", ids.ToList().ConvertAll<string>(id => id.ToString()).ToArray());
return dbContext.Entity1.Where("it.pkIDColumn IN {" + idList + "}");
}
where pkIDColumn is your primary key id column name of your Entity1 table.
BUT KEEP READING!
This is fine, but it requires that I already have the ids of what I need to find. Sometimes I just want my expressions to reach into other relations and what I do have is criteria for those connected relations.
If I had more time I would try to represent this visually, but I don't so just study this sentence a moment: Consider a schema with a Person, GovernmentId, and GovernmentIdType tables. Andrew Tappert (Person) has two id cards (GovernmentId), one from Oregon (GovernmentIdType) and one from Washington (GovernmentIdType).
Now generate an edmx from it.
Now imagine you want to find all the people having a certain ID value, say 1234567.
This can be accomplished with a single database hit with this:
dbContext context = new dbContext();
string idValue = "1234567";
Expression<Func<Person,bool>> expr =
person => person.GovernmentID.Any(gid => gid.gi_value.Contains(idValue));
IEnumerable<Person> people = context.Person.AsQueryable().Where(expr);
Do you see the subquery here? The generated sql will use 'joins' instead of sub-queries, but the effect is the same. These days SQL server optimizes subqueries into joins under the covers anyway, but anyway...
The key to this working is the .Any inside the expression.
I have found the cause of the error (I am using Framework 4.5). The problem is that EF cannot translate a complex type passed as the Contains parameter into a SQL query. In a SQL query, EF can only use simple types such as int, string...
this.GetAll().Where(p => !assignedFunctions.Contains(p))
GetAll provides a list of objects of a complex type (for example, "Function"). So here I would be trying to pass an instance of this complex type into my SQL query, which naturally cannot work!
If I can extract from my list the values my search needs, I can use:
var idList = assignedFunctions.Select(f => f.FunctionId);
this.GetAll().Where(p => !idList.Contains(p.FunctionId))
Now EF no longer has to work with the complex type "Function", but with a simple type (long, for example). And that works fine!
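As a rough, self-contained sketch of the same "project to simple ids first" pattern: the Function class, the Unassigned helper and the sample data are assumptions, and the code runs against an in-memory list, where either version would work. It only shows the shape of the query that EF can translate. Calling ToList on the ids is an extra step not in the answer above, added here so the id collection is materialised once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IdProjectionDemo
{
    // Hypothetical complex type standing in for the "Function" entity.
    public class Function
    {
        public long FunctionId { get; set; }
        public string Name { get; set; }
    }

    public static List<string> Unassigned(IQueryable<Function> all, IEnumerable<Function> assignedFunctions)
    {
        // Reduce the complex objects to a list of simple values first.
        List<long> idList = assignedFunctions.Select(f => f.FunctionId).ToList();

        // The predicate now only mentions a collection of long values and a long property.
        return all
            .Where(p => !idList.Contains(p.FunctionId))
            .Select(p => p.Name)
            .ToList();
    }

    public static void Main()
    {
        var all = new List<Function>
        {
            new Function { FunctionId = 1, Name = "Read" },
            new Function { FunctionId = 2, Name = "Write" },
            new Function { FunctionId = 3, Name = "Admin" }
        };
        var assigned = new List<Function> { all[0], all[2] };

        // prints: Write
        Console.WriteLine(string.Join(", ", Unassigned(all.AsQueryable(), assigned)));
    }
}
```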
I got this error message when the array object used in the .All call was null.
After I initialized the array object (upperSearchList in your case), the error went away.
The error message was misleading in this case:
where upperSearchList.All(arg => person.someproperty.StartsWith(arg))

Resources