LINQ subquery question - linq

Can anybody tell me how I would get the records in the first statement that are not in the second statement (see below)?
First query:
from or in TblOrganisations
where or.OrgType == 2
select or.PkOrgID
Second query:
from o in TblOrganisations
join m in LuMetricSites
on o.PkOrgID equals m.FkSiteID
orderby m.SiteOrder
select o.PkOrgID

If you only need the IDs then Except should do the trick:
var inFirstButNotInSecond = first.Except(second);
Note that Except treats the two sequences as sets. This means that any duplicate elements in first won't be included in the results. I suspect that this won't be a problem since the name PkOrgID suggests a unique ID of some kind.
(See the documentation for Enumerable.Except and Queryable.Except for more info.)
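To make the set behaviour concrete, here is a minimal in-memory sketch. The class and method names (ExceptDemo, InFirstButNotInSecond) are made up for illustration, and it runs against plain collections rather than your tables:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExceptDemo
{
    // Returns the IDs that appear in first but not in second.
    // Except works on sets, so duplicates in first are collapsed.
    public static List<int> InFirstButNotInSecond(IEnumerable<int> first, IEnumerable<int> second)
    {
        return first.Except(second).ToList();
    }
}
```

Calling ExceptDemo.InFirstButNotInSecond(new[] { 1, 2, 2, 3, 4 }, new[] { 3, 4, 5 }) gives 1 and 2, with the repeated 2 appearing only once.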

Do you need the whole records, or just the IDs? The IDs are easy...
var ids = firstQuery.Except(secondQuery);
EDIT: Okay, if you can't do that, you'll need something like:
var secondQuery = ...; // As you've already got it
var query = from or in TblOrganisations
where or.OrgType == 2
where !secondQuery.Contains(or.PkOrgID)
select ...;
Check the SQL it produces, but I think it should do the right thing. Note that there's no point in performing any ordering in the second query - or even the join against TblOrganisations. In other words, you could use:
var query = from or in TblOrganisations
where or.OrgType == 2
where !LuMetricSites.Select(m => m.FkSiteID).Contains(or.PkOrgID)
select ...;
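If you want the whole records rather than just the IDs, the same "not in" filter can be sketched in memory like this. The Organisation and MetricSite classes below are hypothetical stand-ins for your entities, and the HashSet is only there because this version runs over local collections instead of being translated to SQL:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the TblOrganisations and LuMetricSites rows.
public sealed class Organisation
{
    public int PkOrgID { get; set; }
    public int OrgType { get; set; }
}

public sealed class MetricSite
{
    public int FkSiteID { get; set; }
    public int SiteOrder { get; set; }
}

public static class NotInDemo
{
    // Returns the type-2 organisations that have no matching metric site.
    public static List<Organisation> OrgsWithoutMetricSite(
        IEnumerable<Organisation> orgs, IEnumerable<MetricSite> sites)
    {
        // Collect the site IDs once; no ordering or join is needed for a membership test.
        var siteIds = new HashSet<int>(sites.Select(m => m.FkSiteID));

        var query = from org in orgs
                    where org.OrgType == 2
                    where !siteIds.Contains(org.PkOrgID)
                    select org;

        return query.ToList();
    }
}
```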

Use Except:
var filtered = first.Except(second);


Is it possible to break up an element itself before another join?

I have got two xml documents, simplified as
<num Operation="+/-">1</num>
<num Operation="+">3</num>
<num Operation="+/*">4</num>
I want to join NumSetA with NumSetB with the possible operations stated in the Operation tag, ie.
1+2, 1-2, 1+9, 1-9, 3+2, 3+9, 4+2, 4+9, 4*2, 4*9
by using string.split('/')
What I want to do is
var CrossJoin = SetA.Elements("num").join(this.attribute("Operation").value.split('/'),
Sorry for being inventive. Hope you understand what I am saying.
How can I achieve that?
It's pretty easy to do with the query syntax:
var crossJoin =
    from numA in SetA.Elements("num")
    from op in numA.Attribute("Operation").Value.Split('/')
    from numB in SetB.Elements("num")
    select new {
        a = numA.Value,
        op,
        b = numB.Value
    };
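For reference, a self-contained sketch of the same triple from query. The wrapper class CrossJoinDemo and the shape of the second document are assumptions on my part (only the first document is shown in the question), and a num element without an Operation attribute simply produces no rows here:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class CrossJoinDemo
{
    // Builds strings such as "1+2" for every number in setA, every operation
    // listed on that number, and every number in setB.
    public static List<string> Expressions(XElement setA, XElement setB)
    {
        var query =
            from numA in setA.Elements("num")
            // The string cast returns null when the attribute is missing.
            from op in ((string)numA.Attribute("Operation") ?? "")
                .Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries)
            from numB in setB.Elements("num")
            select numA.Value + op + numB.Value;

        return query.ToList();
    }
}
```

With the three num elements from the question in setA, and a setB holding 2 and 9, this produces the ten combinations listed above, grouped by operation within each number.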

Finding strings that are not in DB already

I have some bad performance issues in my application. One of the big operations is comparing strings.
I download a list of strings, approximately 1000 - 10000. These are all unique strings.
Then I need to check if these strings already exist in the database.
The linq query that I'm using looks like this:
IEnumerable<string> allNewStrings = DownloadAllStrings();
var selection = from a in allNewStrings
where !(from o in context.Items
select o.TheUniqueString).Contains(a)
select a;
Am I doing something wrong or how could I make this process faster preferably with Linq?
Your query runs the subquery against context.Items once for every element in allNewStrings, i.e. 1000 - 10000 times, so it's extremely inefficient.
Fetch the unique strings separately so that the database is queried only once:
IEnumerable<string> allNewStrings = DownloadAllStrings();
// ToList() materializes the result, so Contains below does not hit the database again.
var uniqueStrings = (from o in context.Items
                     select o.TheUniqueString).ToList();
var selection = from a in allNewStrings
                where !uniqueStrings.Contains(a)
                select a;
Now you can see that the last query could be written using Except, which is more efficient for set operations like the one in your example:
var selection = allNewStrings.Except(uniqueStrings);
An alternative solution would be to use a HashSet:
var set = new HashSet<string>(DownloadAllStrings());
set.ExceptWith(context.Items.Select(s => s.TheUniqueString));
The set will now contain the strings that are not in the DB.
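A minimal sketch of the HashSet approach over plain collections, so it can be tried without a database. The names NewStringsDemo and NotYetStored are made up, and the ordinal (case-sensitive) comparer is an assumption; if your database column uses a case-insensitive collation you may want a different comparer:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NewStringsDemo
{
    // Returns the downloaded strings that are not already stored.
    // existing is enumerated exactly once by ExceptWith.
    public static HashSet<string> NotYetStored(
        IEnumerable<string> downloaded, IEnumerable<string> existing)
    {
        var set = new HashSet<string>(downloaded, StringComparer.Ordinal);
        set.ExceptWith(existing);
        return set;
    }
}
```

Each lookup in the set is a hash lookup, so the total work grows roughly with the sum of the two list sizes rather than their product.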

EF - Linq Expression and using a List of Ints to get best performance

So I have a list(table) of about 100k items and I want to retrieve all values that match a given list.
I have something like this.
The key on the Sections table is NOT a primary key, so I'm expecting each value in listOfKeys to return a few rows.
List<int> listOfKeys = new List<int>(){1,3,44};
var allSections = Sections.Where(s => listOfKeys.Contains(;
I don't know if it makes a difference but generally listOfKeys will only have between 1 to 3 items.
I'm using the Entity Framework.
So my question is, is this the best / fastest way to include a list in a linq expression?
I'm assuming that it isn't better to use another .NET ICollection data object. Should I be using a Union or something?
If listOfKeys contains only a small number of items (say, fewer than 50) and it's a local list (not from the database), then it's OK. The generated query will basically be WHERE id IN (...) or WHERE id = ... OR id = ... ..., and the database engine handles that fine.
A Join would probably be more efficient:
var allSections =
from s in Sections
join k in listOfKeys on equals k
select s;
Or, if you prefer the extension method syntax:
var allSections = Sections.Join(listOfKeys, s =>, k => k, (s, k) => s);
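To compare the two shapes side by side, here is an in-memory sketch. The Section class and its Key property are hypothetical (the real column name isn't shown), and this runs with LINQ to Objects, so it says nothing about the SQL that Entity Framework would generate:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity; Key stands in for the non-unique key column.
public sealed class Section
{
    public int Key { get; set; }
    public string Name { get; set; } = "";
}

public static class SectionLookupDemo
{
    // Filter with Contains: each section is returned at most once.
    public static List<Section> ByContains(IEnumerable<Section> sections, List<int> keys)
    {
        return sections.Where(s => keys.Contains(s.Key)).ToList();
    }

    // Filter with Join: Distinct() guards against duplicate keys,
    // which would otherwise return the same section more than once.
    public static List<Section> ByJoin(IEnumerable<Section> sections, List<int> keys)
    {
        return sections.Join(keys.Distinct(), s => s.Key, k => k, (s, k) => s).ToList();
    }
}
```

Both methods return the same rows for the same input; the Join version just needs the extra care around duplicate keys.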

How can I merge two outputs of two Linq queries?

I'm trying to merge these two objects but I'm not totally sure how. Can you help me merge these two result objects?
// Create Linq Query for all segments in "CognosSecurity"
var userListAuthoritative = (from c in ctx.CognosSecurities
where (c.SecurityType == 1 || c.SecurityType == 2)
select new {c.SecurityType, c.LoginName , c.SecurityName}).Distinct();
// Create Linq Query for all segments in "CognosSecurity"
var userListAuthoritative3 = (from c in ctx.CognosSecurities
where c.SecurityType == 3 || c.SecurityType == 0
select new {c.SecurityType , c.LoginName }).Distinct();
I think I see where to go with this... but to answer the question the types of the objects are int, string, string for SecurityType, LoginName , and SecurityName respectively
If you're wondering why I have them broken up like this, it's because I want to ignore one column when doing a distinct. Here are the SQL queries that I'm converting to LINQ.
select distinct SecurityType, LoginName, 'Segment'+'-'+SecurityName
FROM [NFPDW].[dbo].[CognosSecurity]
where SecurityType =1
select distinct SecurityType, LoginName, 'Business Line'+'-'+SecurityName
FROM [NFPDW].[dbo].[CognosSecurity]
where SecurityType =2
select distinct SecurityType, LoginName, SecurityName
FROM [NFPDW].[dbo].[CognosSecurity]
where SecurityType in (1,2)
You can't join these because the types are different (first has 3 properties in the resulting type, second has two).
If you can tolerate putting a null value in for the 3rd property of the second query's result, this will help. I would then suggest you just do userListAuthoritative.Concat(userListAuthoritative3), BUT I think this will not work, as the anonymous types generated by LINQ will not be the same class, even though the structure is the same. To solve that, you can either define a CustomType to encapsulate the tuple and do select new CustomType { ... } in both queries, or post-process the results using Select() in a similar fashion.
Actually, the latter Select() approach will also let you solve the property count mismatch, by supplying a null for the missing property when projecting to CustomType in the post-processing step.
EDIT: According to the comment below once the structures are the same the anonymous types will be the same.
I assume that you want to keep the results distinct:
var merged = userListAuthoritative.Concat(userListAuthoritative3).Distinct();
And, as Mike Q pointed out, you need to make sure that your types match, either by giving the anonymous types the same signature, or by creating your own POCO class specifically for this purpose.
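Here is a rough in-memory sketch of the "same signature" option, where the second projection pads SecurityName with a null so both anonymous types have identical property names, types and order. The CognosSecurity class and MergeDemo wrapper are stand-ins I made up, and the result is flattened to strings only so the method has a nameable return type:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a row of the CognosSecurity table.
public sealed class CognosSecurity
{
    public int SecurityType { get; set; }
    public string LoginName { get; set; } = "";
    public string SecurityName { get; set; } = "";
}

public static class MergeDemo
{
    public static List<string> Merge(IEnumerable<CognosSecurity> rows)
    {
        var withName = rows
            .Where(c => c.SecurityType == 1 || c.SecurityType == 2)
            .Select(c => new { c.SecurityType, c.LoginName, c.SecurityName });

        // Same property names, types and order, so this is the same anonymous type.
        var withoutName = rows
            .Where(c => c.SecurityType == 3 || c.SecurityType == 0)
            .Select(c => new { c.SecurityType, c.LoginName, SecurityName = (string)null });

        // Anonymous types compare by value, so Distinct removes repeated rows.
        return withName.Concat(withoutName).Distinct()
            .Select(x => $"{x.SecurityType}|{x.LoginName}|{x.SecurityName}")
            .ToList();
    }
}
```

Because SecurityName is null for types 0 and 3, rows that differ only in that column collapse into one, which is the "ignore one column" effect asked about.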
If I understand your edit, you want your Distinct to ignore the SecurityName column. Is that correct?
var userListAuthoritative = from c in ctx.CognosSecurities
                            where new[]{0,1,2,3}.Contains(c.SecurityType)
                            group new {c.SecurityType, c.LoginName, c.SecurityName}
                            by new {c.SecurityType, c.LoginName} into g
                            select g.FirstOrDefault();
I'm not exactly sure what you mean by merge, since you're returning different (anonymous) types from each one. Is there a reason the following doesn't work for you?
var userListAuthoritative = (from c in ctx.CognosSecurities
where (c.SecurityType == 1 || c.SecurityType == 2 || c.SecurityType == 3 || c.SecurityType == 0)
select new {c.SecurityType, c.LoginName , c.SecurityName}).Distinct();
Edit: This assumed they were of the same type -- but they're not.
Try below code, you might need to implement IEqualityComparer<T> in your ctx type.
var merged = userListAuthoritative.Union(userListAuthoritative3);

Stuck on a subquery that is grouping, in Linq

I have some Linq code and it's working fine. It's a query that has a subquery in the Where clause. This subquery is doing a groupby. Works great.
The problem is that I don't know how to grab one of the results from the subquery out of the subquery into the parent.
First, here's the code. After that, I'll explain what piece of data I want to extract.
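var results = (from a in db.tblProducts
               where (from r in db.tblReviews
                      where r.IdUserModified == 1
                      group r by new
                      {
                          r.IdProductCode_Alpha,
                          r.IdProductCode_Beta,
                          r.IdProductCode_Gamma
                      }
                      into productGroup
                      orderby productGroup.Count() descending
                      select new
                      {
                          productGroup.Key.IdProductCode_Alpha,
                          productGroup.Key.IdProductCode_Beta,
                          productGroup.Key.IdProductCode_Gamma,
                          ReviewCount = productGroup.Count()
                      }).Take(3).Any(
                          r =>
                          r.IdProductCode_Alpha == a.IdProductCode_Alpha &&
                          r.IdProductCode_Beta == a.IdProductCode_Beta &&
                          r.IdProductCode_Gamma == a.IdProductCode_Gamma)
               where a.ProductFirstName == ""
               select new {a.IdProduct, a.FullName}).ToList();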
Ok. I've changed some field and tables names to protect the innocent. :)
See this last line :-
select new {a.IdProduct, a.FullName}).ToList();
I wish to include the ReviewCount (from the subquery) in that. I'm just not sure how.
To help understand the problem, this is what the data looks like.
Sub Query
IdProductCode_Alpha = 1, IdProductCode_Beta = 2, IdProductCode_Gamma = 3, ReviewCount = 10
... row 2 ...
... row 3 ...
Parent Query
IdProduct = 69, FullName = 'Jon Skeet's Wonder Balm'
So the subquery grabs the actual data I need. The parent query determines the correct product, based on the subquery filters.
EDIT 1: Schema
tblReviews (each product has zero to many reviews)
IdProductCode_Alpha (can be null)
IdProductCode_Beta (can be null)
IdProductCode_Gamma (can be null)
So I'm trying to find the top 3 products a person has done reviews on.
The LINQ works perfectly... except I just don't know how to include the COUNT in the parent query (i.e. pull that result from the subquery).
Cheers :)
Got it myself. Take note of the double from at the start of the query, then the Any() being replaced by a Where() clause.
var results = (from a in db.tblProducts
               from g in (
                   from r in db.tblReviews
                   where r.IdUserModified == 1
                   group r by new
                   {
                       r.IdProductCode_Alpha,
                       r.IdProductCode_Beta,
                       r.IdProductCode_Gamma
                   }
                   into productGroup
                   orderby productGroup.Count() descending
                   select new
                   {
                       productGroup.Key.IdProductCode_Alpha,
                       productGroup.Key.IdProductCode_Beta,
                       productGroup.Key.IdProductCode_Gamma,
                       ReviewCount = productGroup.Count()
                   }).Take(3)
               where g.IdProductCode_Alpha == a.IdProductCode_Alpha &&
                     g.IdProductCode_Beta == a.IdProductCode_Beta &&
                     g.IdProductCode_Gamma == a.IdProductCode_Gamma
               where a.ProductFirstName == ""
               select new {a.IdProduct, a.FullName, g.ReviewCount}).ToList();
I don't understand LINQ completely, but wouldn't a JOIN work?
I know my answer doesn't help but it looks like you need a JOIN with the inner table(?).
I agree with shahkalpesh, both about the schema and the join.
You should be able to refactor...
r => r.IdProductCode_Alpha == a.IdProductCode_Alpha &&
r.IdProductCode_Beta == a.IdProductCode_Beta &&
r.IdProductCode_Gamma == a.IdProductCode_Gamma
into an inner join with tblProducts.
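As a rough illustration of that refactoring, here is an in-memory sketch that joins products to the grouped review counts on a composite key. The Product and Review classes, the shortened CodeAlpha/CodeBeta/CodeGamma property names and the TopReviewed method are all hypothetical, and it runs with LINQ to Objects, where two null key parts compare as equal (SQL would not match them):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for tblProducts and tblReviews rows.
public sealed class Product
{
    public int IdProduct { get; set; }
    public string FullName { get; set; } = "";
    public int? CodeAlpha { get; set; }
    public int? CodeBeta { get; set; }
    public int? CodeGamma { get; set; }
}

public sealed class Review
{
    public int IdUserModified { get; set; }
    public int? CodeAlpha { get; set; }
    public int? CodeBeta { get; set; }
    public int? CodeGamma { get; set; }
}

public static class TopReviewedDemo
{
    public static List<(int IdProduct, string FullName, int ReviewCount)> TopReviewed(
        IEnumerable<Product> products, IEnumerable<Review> reviews, int userId, int take)
    {
        // Group the user's reviews by product code and keep the most reviewed groups.
        var topGroups =
            (from r in reviews
             where r.IdUserModified == userId
             group r by new { r.CodeAlpha, r.CodeBeta, r.CodeGamma } into g
             orderby g.Count() descending
             select new { g.Key, ReviewCount = g.Count() }).Take(take);

        // Inner join on the composite key; both sides build the same anonymous type.
        var query =
            from p in products
            join g in topGroups
                on new { p.CodeAlpha, p.CodeBeta, p.CodeGamma } equals g.Key
            select (p.IdProduct, p.FullName, g.ReviewCount);

        return query.ToList();
    }
}
```

The join returns rows in product order, not in review-count order, so add an orderby on ReviewCount if the ranking matters.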
