Linq-to-sql Not Contains or Not in?

I'm building a poll widget. I have two tables, call them Polls and PollsCompleted. I need a LINQ query that gets all the Polls that do not exist for a given user in PollsCompleted.
I have the following sets:
For Polls
Where Active == True
For PollsCompleted
Where UserId == ThisUserId
Where PollId == Polls.Id
Now I need to get all Polls that do not exist in PollsCompleted. I need an example for this using either a single or multiple queries. I've tried to break it down into 2 queries.
Basically, I have two IQueryables of types T and T1. I want to take all Ts where T.ID does not exist in T1.ParentId.

T.Where(x => !T1.Select(y => y.ParentID).Contains(x.ID))
In Linq you often work from the bottom up. Here we first get a collection of all the parentIDs in T1 -- the T1.Select(...) part. Then we create a where clause that selects all of the Ts whose IDs are not contained in that set.
Note that the result is a query. To materialize it, use ToList() or similar on the statement above.
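As a language-neutral illustration of the same bottom-up logic, here is a minimal Python sketch. The Poll and PollCompleted records, their field names, and the sample data are assumptions invented for the example, not the asker's actual schema. A generator expression stands in for the deferred LINQ query, and list() plays the role of ToList().

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Poll:  # hypothetical stand-in for a row of Polls
    id: int
    active: bool


@dataclass(frozen=True)
class PollCompleted:  # hypothetical stand-in for a row of PollsCompleted
    poll_id: int
    user_id: int


polls = [Poll(1, True), Poll(2, True), Poll(3, False), Poll(4, True)]
completed = [PollCompleted(1, 42), PollCompleted(4, 42), PollCompleted(2, 7)]


def open_polls_for(user_id, polls, completed):
    # Bottom up: first collect the poll IDs this user has already completed...
    done_ids = {c.poll_id for c in completed if c.user_id == user_id}
    # ...then keep the active polls whose ID is NOT in that set.
    # The result is a lazy generator, comparable to an unmaterialized query.
    return (p for p in polls if p.active and p.id not in done_ids)


query = open_polls_for(42, polls, completed)
result = list(query)  # materialize, like ToList()
print([p.id for p in result])
```

Collecting the completed IDs into a set first keeps each membership test cheap; against a database the provider would normally turn the equivalent LINQ into a NOT EXISTS or NOT IN subquery instead.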

Use Except; it will work in this case.
For reference, see the documentation for the Enumerable.Except method.

Related

Transform select statement

How can I dynamically transform an SQL query?
I know there is a Select.getSelect(), but how can I add fields to the select query?
Use case: for a REST query I have a lot of paginated resources, and I have an abstraction to create the paginated query. It takes the SelectConditionStep and adds the rest, depending on additional parameters. It works really well for simple queries, but for queries containing joins a little transformation of the query would be required (mainly because I can't naively limit the number of results, since the join can be a one-to-many relationship).
The easiest way is to keep a List<Field<?>> where you add the fields for your select() clause, and then create the Select statement only when you actually execute it, instead of passing a Select object around. Example:
List<Field<?>> fields = new ArrayList<>();
// Just some examples:
fields.addAll(getDefaultFields());
fields.addAll(getFieldsFromUI());
fields.addAll(getCalculatedFields());
// Much later on, you finally create the statement:
DSL.using(configuration)
   .select(fields)
   .from(...)
   .fetch();
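The same "collect first, build late" idea can be sketched outside jOOQ. The following uses Python's standard-library sqlite3 module purely as an illustration; the book table, the column whitelist, and the build_select helper are hypothetical names made up for this sketch, and nothing here is jOOQ API.

```python
import sqlite3

ALLOWED_COLUMNS = {"id", "title", "author"}  # hypothetical whitelist


def build_select(columns, table, limit, offset):
    # Assemble the statement only at the last moment, from the collected columns.
    # Identifiers cannot be bound as parameters, so they are checked against a
    # whitelist; the table name is assumed to come from trusted code.
    unknown = [c for c in columns if c not in ALLOWED_COLUMNS]
    if unknown:
        raise ValueError(f"unexpected columns: {unknown}")
    sql = f"SELECT {', '.join(columns)} FROM {table} ORDER BY id LIMIT ? OFFSET ?"
    return sql, (limit, offset)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [(1, "A", "x"), (2, "B", "y"), (3, "C", "z")],
)

columns = ["id"]         # default fields
columns.append("title")  # fields added later, e.g. from request parameters

# Much later on, the statement is finally created and executed.
sql, params = build_select(columns, "book", limit=2, offset=1)
rows = conn.execute(sql, params).fetchall()
print(rows)
```

Passing around the list of columns instead of a finished statement means the pagination layer can still decide how to wrap or limit the query when it is finally built.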

Clean way to write this query

I'm looking for a clean way to write this Linq query.
Basically, I have a collection of objects with IDs. Using NHibernate and LINQ, I need to check whether the NHibernate entity has a subclass collection that contains every ID in the object collection.
If there was just one item this would work:
var objectImCheckingAgainst = ... //irrelevant
where Obj.SubObj.Any(a => a.id == objectImCheckingAgainst.Id)
Now I want to instead somehow pass a list of objectImCheckingAgainst and return true only if the Obj.SubObj collection contains all items in list of objectImCheckingAgainst based on Id.
I like to use GroupJoin for this.
return objectImCheckingAgainst.GroupJoin(Obj.SubObj,
        a => a.Id,
        b => b.id,
        (a, b) => b.Any())
    .All(c => c);
I believe this query should be more or less self-explanatory, but essentially, this joins the two collections using their respective ids as keys, then groups those results. Then for each of those groupings, it determines whether any matches exist. Finally, it ensures that all groupings had matches.
A useful alternative that I sometimes use is .Count() == 1 instead of the .Any(). Obviously, the difference there is whether you want to support multiple elements with the same id matching. From your description, it sounded like that either doesn't matter or is enforced by another means. But that's an easy swap, either way.
An important concept in GroupJoin that I know is relevant, but may or may not be obvious, is that the first enumerable (which is to say, the first argument to the extension method, or objectImCheckingAgainst in this example) will have all its elements included in the result, but the second one may or may not. It's not like Join, where the ordering is irrelevant. If you're used to SQL, these are the elementary beginnings of a LEFT OUTER JOIN.
Another way you could accomplish this, somewhat more simply but not as efficiently, would be to simply nest the queries:
return objectImCheckingAgainst.All(c => Obj.SubObj.Any(x => x.id == c.Id));
I say this because it's pretty similar to the example you provided.
I don't have any experience with NHibernate, but I know many ORMs (I believe EF included) will map this to SQL, so efficiency may or may not be a concern. In general, though, I like to write LINQ so that it works as well in memory as against a database, so I'd go with the first one I mentioned.
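To make the difference between the two forms concrete, here is a small in-memory sketch in Python rather than C#. Both function names are made up for the illustration, and plain lists of IDs stand in for objectImCheckingAgainst and Obj.SubObj. The grouped version mimics what a group join does conceptually: index the second collection by key once, then check that every element of the first collection has a non-empty group.

```python
from collections import defaultdict


def contains_all_nested(wanted_ids, sub_ids):
    # Nested form: for every wanted ID, scan the sub-collection for a match.
    return all(any(s == w for s in sub_ids) for w in wanted_ids)


def contains_all_grouped(wanted_ids, sub_ids):
    # Group-join form: build the groups once, keyed by ID...
    groups = defaultdict(list)
    for s in sub_ids:
        groups[s].append(s)
    # ...then require a non-empty group for every wanted ID. Every wanted ID
    # takes part in the check, even when it has no matches at all.
    return all(len(groups.get(w, [])) > 0 for w in wanted_ids)


print(contains_all_grouped([1, 2], [1, 2, 3]))
print(contains_all_grouped([1, 4], [1, 2, 3]))
```

Swapping the `> 0` test for `== 1` gives the stricter variant that rejects duplicate matches.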
I'm not well versed in LINQ-to-NHibernate, but when using LINQ against any SQL backend it's always important to keep an eye on the generated SQL. I think this where clause...
where Obj.SubObj.All(a => idList.Contains(a.id))
...will produce the best SQL (having an IN statement).
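idList is a list of Ids extracted from the list of objectImCheckingAgainst objects. Note that, as written, the clause checks that every SubObj id appears in idList. To test the direction the question asks about (every id in idList appears in SubObj), one option is to compare counts, for example Obj.SubObj.Count(a => idList.Contains(a.id)) == idList.Count, assuming the ids are unique on both sides.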

Concatenating a LINQ query and LINQ sort statement

I'm having a problem joining two LINQ queries.
Currently, my (original) code looks like this
s.AnimalTypes.Sort((x, y) => string.Compare(x.Type, y.Type));
What I'm needing to do is change this based on a date, then select all data past that date, so I have
s.AnimalTypes.Select(t=>t.DateChanged > dateIn).ToList()
s.AnimalTypes.Sort((…
This doesn't look right as it's not sorting the data selected, rather sorting everything in s.AnimalTypes.
Is there a way to concatenate the two LINQ lines? I've tried
s.AnimalTypes.Select(t=>t.DateChanged > dateIn).ToList().Sort((…
but this gives me an error on the Sort section.
Is there a simple way to do this? I've looked around and Group and OrderBy keep cropping up, but I'm not sure these are what I need here.
Thanks
From your description, I believe you want something like:
var results = s.AnimalTypes.Where(t => t.DateChanged > dateIn).OrderBy(t => t.Type);
You can call ToList() to convert to a List<T> at the end if required.
There are a couple of fundamental concepts I believe you are missing here -
First, unlike List<T>.Sort, the LINQ extension methods don't change the original collections, but rather return a new IEnumerable<T> with the filtered or sorted results. This means you always need to assign something to the return value (hence my var results = above).
Second, Select performs a mapping operation - transforming the data from one form to another. For example, you could use it to extract out the DateChanged (Select(t => t.DateChanged)), but this would give you an enumeration of dates, not the original animal types. In order to filter or restrict the list with a predicate (criteria), you'd use Where instead.
Finally, you can use OrderBy to reorder the resulting enumerable.
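The distinction between mapping, filtering, and sorting can be seen in a few lines of Python, used here only as an illustration. The AnimalType record, its field names, and the sample data are assumptions for the sketch. A list comprehension that maps plays the part of Select, a filtering generator plays the part of Where, and sorted() plays the part of OrderBy, returning a new list and leaving the original untouched.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AnimalType:  # hypothetical stand-in for the asker's type
    type: str
    date_changed: date


animal_types = [
    AnimalType("Zebra", date(2021, 5, 1)),
    AnimalType("Cat", date(2019, 3, 1)),
    AnimalType("Dog", date(2022, 1, 15)),
]
date_in = date(2020, 1, 1)

# Mapping (like Select): one output per input, here booleans, not animal types.
flags = [t.date_changed > date_in for t in animal_types]

# Filtering (like Where), then sorting (like OrderBy); the result is a new list.
results = sorted(
    (t for t in animal_types if t.date_changed > date_in),
    key=lambda t: t.type,
)

print(flags)
print([t.type for t in results])
print([t.type for t in animal_types])  # original order is unchanged
```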
You are using Select when you actually want to use Where.
Select is a projection from a collection of one type into a collection of another type. You won't increase or reduce the number of items in a collection using Select, but you can use it to select each object's name or some other property.
Where is what you would use to filter a collection based on a boolean predicate.

Trying to execute a WHERE IN: Invalid 'where' condition. An entity member is invoking an invalid property or method

I'm trying to get a list of cases whose AccountID is found in a previous list.
The error occurs on the last line of the following:
// Gets the list of permissions for the current contact
var perms = ServiceContext.GetCaseAccessByContact(Contact).Cast<Adx_caseaccess>();
// Get the list of account IDs from the permissions list
var customerIDs = perms.Select(p => p.adx_accountid).Distinct();
// Get the list of cases that belong to any account whose ID is in the `customerID` list
var openCases = (from c in ServiceContext.IncidentSet where customerIDs.Contains(c.AccountId) select c).ToList();
I'm not sure which "invalid property" the error is talking about. The code compiles; I just get the error at runtime.
The problem is the CRM Linq Provider. It doesn't support all of the available options that the Linq-to-objects provider offers. In this case, the CRM does not support the Enumerable.Contains() method.
where:
The left side of the clause must be an attribute name and the
right side of the clause must be a value. You cannot set the left side
to a constant. Both the sides of the clause cannot be constants.
Supports the String functions Contains, StartsWith, EndsWith, and
Equals.
You can work around this in one of two ways:
Rework your query to use a more natural join.
If a join is not possible, you can use Dynamic Linq to generate a list of OR clauses on each item in customerIDs. This would function similarly to Enumerable.Contains.
See my answer or the accepted answer to the question "How to get all the birthdays of today?" for two separate ways to accomplish this.
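The idea behind the second workaround, replacing a Contains check with a chain of OR'ed equality clauses, can be sketched in plain Python. This is only an in-memory illustration of the technique, not the CRM provider or Dynamic Linq API; the equals and any_of helpers, the dictionary shape of a case, and the sample IDs are all hypothetical.

```python
from functools import reduce


def equals(account_id):
    # One "clause": attribute on the left, constant value on the right.
    return lambda case: case["account_id"] == account_id


def any_of(predicates):
    # Fold the clauses together with OR; an empty list matches nothing.
    return reduce(
        lambda left, right: (lambda case: left(case) or right(case)),
        predicates,
        lambda case: False,
    )


cases = [
    {"title": "c1", "account_id": "A"},
    {"title": "c2", "account_id": "B"},
    {"title": "c3", "account_id": "C"},
]
customer_ids = ["A", "C"]

# Equivalent to: account_id == "A" OR account_id == "C"
is_customer_case = any_of([equals(i) for i in customer_ids])
open_cases = [c for c in cases if is_customer_case(c)]
print([c["title"] for c in open_cases])
```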

How do I sort, group a query properly that returns a tuple of an orm object and a custom column?

I am looking for a way to have a query that returns a tuple first sorted by a column, then grouped by another (in that order). Simply .sort_by().group_by() didn't appear to work. Now I tried the following, which made the return value go wrong (I just got the orm object, not the initial tuple), but read for yourself in detail:
Base scenario:
There is a query which queries for test orm objects linked from the test3 table through foreign keys.
This query also returns a column named linked that either contains true or false. It is originally ungrouped.
my_query = session.query(test_orm_object)
... lots of stuff like joining various things ...
add_column(..condition that either puts 'true' or 'false' into the column..)
So the original return value is a tuple (the orm object, and additionally the true/false column).
Now this query should be grouped for the test orm objects (so the test.id column), but before that, sorted by the linked column so entries with true are preferred during the grouping.
Assuming the current unsorted, ungrouped query is stored in my_query, my approach to achieve this was this:
# Get a sorted subquery
tmpquery = my_query.order_by(desc('linked')).subquery()
# Read the column out of the sub query
my_query = session.query(tmpquery).add_columns(getattr(tmpquery.c,'linked').label('linked'))
my_query = my_query.group_by(getattr(tmpquery.c, 'id')) # Group objects
The resulting SQL query when running this is shown below. It looks fine to me: the subquery 'anon_1' is properly sorted internally, then its id as well as the 'linked' column are extracted (amongst a few other columns SQLAlchemy apparently wants), and the result is properly grouped:
SELECT anon_1.id AS anon_1_id, anon_1.name AS anon_1_name, anon_1.fk_test3 AS anon_1_fk_test3, anon_1.linked AS anon_1_linked, anon_1.linked AS linked
FROM (
SELECT test.id AS id, test.name AS name, test.fk_test3 AS fk_test3, CASE WHEN (anon_2.id = 87799534) THEN 'true' ELSE 'false' END AS linked
FROM test LEFT OUTER JOIN (SELECT test3.id AS id, test3.fk_testvalue AS fk_testvalue
FROM test3)
AS anon_2 ON anon_2.fk_testvalue = test.id ORDER BY linked DESC
)
AS anon_1 GROUP BY anon_1.id
I tested it in phpMyAdmin, where it gave me, as expected, the id column (for the orm object id), then the additional columns SQLAlchemy seems to want there, and the linked column. So far, so good.
Now my expected return values would be, as they were from the original unsorted, ungrouped query:
A tuple: 'test' orm object (anon_1.id column), 'true'/'false' value (linked column)
The actual return value of the new sorted/grouped query, however, is the following (the original query does indeed return a tuple before the code above is applied):
'test' orm object only
Why is that so and how can I fix it?
Excuse me if that approach turns out to be somewhat flawed.
What I actually want is, have the original query simply sorted, then grouped without touching the return values. As you can see above, my attempt was to 'restore' the additional return value again, but that didn't work. What should I do instead, if this approach is fundamentally wrong?
Explanation for the subquery use:
The point of the whole subquery is to force SQLAlchemy to execute this query separately as a first step.
I want to order the results first, and then group the ordered results. That seems to be hard to do properly in one step (when trying manually with SQL I had issues combining order and group by in one step as I wanted).
Therefore I don't simply order, group, but I order first, then subquery it to enforce that the order step is actually completed first, and then I group it.
Judging from manual phpMyAdmin tests with the generated SQL, this seems to work fine. The actual problem is that the original query (which is now wrapped as the subquery you were confused about) had an added column, and now by wrapping it up as a subquery, that column is gone from the overall result. And my attempt to re-add it to the outer wrapping failed.
It would be much better if you provided examples. I don't know if these columns are in separate tables or what not. Just looking at your first paragraph, I would do something like this:
a = session.query(Table1, Table2.column).\
    join(Table2, Table1.foreign_key == Table2.id).\
    filter(...).group_by(Table2.id).order_by(Table1.property.desc()).all()
I don't know exactly what you're trying to do since I need to look at your actual model, but it should look something like this with maybe the tables/objs flipped around or more filters.
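A different technique that avoids the sort-then-group subquery altogether is to aggregate the flag, since SQL does not generally guarantee that ordering inside a subquery decides which row a GROUP BY keeps. The sketch below uses Python's standard-library sqlite3 module rather than SQLAlchemy and MySQL; the table layout is loosely modeled on the generated SQL in the question, while the sample rows and target_id are made up. It relies on 'true' comparing greater than 'false' as strings, so MAX yields 'true' whenever any joined row is linked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE test3 (id INTEGER PRIMARY KEY, fk_testvalue INTEGER);
    INSERT INTO test VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO test3 VALUES (10, 1), (11, 1), (12, 2);
""")

target_id = 11  # hypothetical ID playing the role of 87799534 in the question

rows = conn.execute(
    """
    SELECT test.id, test.name,
           MAX(CASE WHEN test3.id = ? THEN 'true' ELSE 'false' END) AS linked
    FROM test
    LEFT OUTER JOIN test3 ON test3.fk_testvalue = test.id
    GROUP BY test.id, test.name
    ORDER BY test.id
    """,
    (target_id,),
).fetchall()

for row in rows:
    print(row)
```

Each result row still carries both the object's columns and the linked value, so the shape matches the tuple the original ungrouped query returned.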

Resources