How to compare two tables in linq to sql? - linq

I am trying to compare two tables (i.e values, count, etc..) in linq to sql but I am not getting the way to achieve it. I tried the following,
Table1.Any(i => i.itemNo == Table2.itemNo)
It gives error. Could you please help me?
Thanks in Advance.

how about
var isDifferent =
Table1.Zip(Table2, (j, k) => j.itemNo == k.itemMo).Any(m => !m);
EDIT
if Linq-To-Sql does not support Zip.
var one = Table1.ToList();
var two = Table2.ToList();
var isDifferent =
one.Zip(two, (j, k) => j.itemNo == k.itemMo).Any(m => !m);
if the tables are vary large this could cause performance problems. In that case you will need a much more sophisticated solution, if so, please ask.
EDIT2
If the tables are very large you don't want to get all the data from the server and hold it memory. Additionaly, Linq and SQL server do not garauntee the order of the rows unless you specify an order in the query. This becomes espcially relavent for large result sets returned by a multi processor server where the effects of parallelism are likely to come into play.
I suggest that Linq-to-Sql doesen't really cater well for your scenario so you will have to help it out using ExecuteQuery somthing like this.
string zipQuery =
#"SELECT TOP 1
1
FROM
[Table1] [one]
WHERE
NOT EXISTS (
SELECT * FROM [Table2] [two] WHERE [two].[itemNo] = [one].[itemNo]
)
UNION ALL
SELECT
1
FROM
[Table2] [two]
WHERE
NOT EXISTS (
SELECT * FROM [Table1] [one] WHERE [one].[itemNo] = [two].[itemNo]
)
UNION ALL
SELECT 0";
var isDifferent = context.ExecuteQuery<int>(zipQuery).Single() == 1;
This will do the select on the server without returning lots of data to the client but, I think you will agree is much more complicated.
EDIT3
Okay, the zip approach should be fine for 1000 rows. I've read your comment and I suggest changing the code accordingly.
var one = Table1.ToList();
var two = Table2.ToList();
var isDifferent =
one.Count != two.Count ||
one.Zip(two, (o, t) => o.itemNo == k.itemNo).Any(m => !m);
You should probably consider putting an order by on the list retrievers, like this.
var one = Table1.OrderBy(o => o.itemNo).ToList();
Strictly, the results of a Linq-to-Sql come back in any order unless an order is specified.

Related

Load only some elements of a nested collection efficiently with LINQ

I have the following LINQ query (using EF Core 6 and MS SQL Server):
var resultSet = dbContext.Systems
.Include(system => system.Project)
.Include(system => system.Template.Type)
.Select(system => new
{
System = system,
TemplateText = system.Template.TemplateTexts.FirstOrDefault(templateText => templateText.Language == locale.LanguageIdentifier),
TypeText = system.Template.Type.TypeTexts.FirstOrDefault(typeText => typeText.Language == locale.LanguageIdentifier)
})
.FirstOrDefault(x => x.System.Id == request.Id);
The requirement is to retrieve the system matching the requested ID and load its project, template and template's type info. The template has multiple TemplateTexts (one for each translated language) but I only want to load the one matching the requested locale, same deal with the TypeTexts elements of the template's type.
The LINQ query above does that in one query and it gets converted to the following SQL query (I edited the SELECT statements to use * instead of the long list of columns generated):
SELECT [t1].*, [t2].*, [t5].*
FROM (
SELECT TOP(1) [p].*, [t].*, [t0].*
FROM [ParkerSystems] AS [p]
LEFT JOIN [Templates] AS [t] ON [p].[TemplateId] = [t].[Id]
LEFT JOIN [Types] AS [t0] ON [t].[TypeId] = [t0].[Id]
LEFT JOIN [Projects] AS [p0] ON [p].[Project_ProjectId] = [p0].[ProjectId]
WHERE [p].[SystemId] = #__request_Id_1
) AS [t1]
LEFT JOIN (
SELECT [t3].*
FROM (
SELECT [t4].*, ROW_NUMBER() OVER(PARTITION BY [t4].[ReferenceId] ORDER BY [t4].[Id]) AS [row]
FROM [TemplateTexts] AS [t4]
WHERE [t4].[Language] = #__locale_LanguageIdentifier_0
) AS [t3]
WHERE [t3].[row] <= 1
) AS [t2] ON [t1].[Id] = [t2].[ReferenceId]
LEFT JOIN (
SELECT [t6].*
FROM (
SELECT [t7].*, ROW_NUMBER() OVER(PARTITION BY [t7].[ReferenceId] ORDER BY [t7].[Id]) AS [row]
FROM [TypeTexts] AS [t7]
WHERE [t7].[Language] = #__locale_LanguageIdentifier_0
) AS [t6]
WHERE [t6].[row] <= 1
) AS [t5] ON [t1].[Id0] = [t5].[ReferenceId]
which is not bad, it's not a super complicated query, but I feel like my requirement can be solved with a much simpler SQL query:
SELECT *
FROM [Systems] AS [p]
JOIN [Templates] AS [t] ON [p].[TemplateId] = [t].[Id]
JOIN [TemplateTexts] AS [tt] ON [p].[TemplateId] = [tt].[ReferenceId]
JOIN [Types] AS [ty] ON [t].[TypeId] = [ty].[Id]
JOIN [TemplateTexts] AS [tyt] ON [ty].[Id] = [tyt].[ReferenceId]
WHERE [p].[SystemId] = #systemId and tt.[Language] = 2 and tyt.[Language] = 2
My question is: is there a different/simpler LINQ expression (either in Method syntax or Query syntax) that produces the same result (get all info in one go) because ideally I'd like to not have to have an anonymous object where the filtered sub-collections are aggregated. For even more brownie points, it'd be great if the generated SQL would be simpler/closer to what I think would be a simple query.
Is there a different/simpler LINQ expression (...) that produces the same result
Yes (maybe) and no.
No, because you're querying dbContext.Systems, therefore EF will return all systems that match your filter, also when they don't have TemplateTexts etc. That's why it has to generate outer joins. EF is not aware of your apparent intention to skip systems without these nested data or of any guarantee that these systems don't occur in the database. (Which you seem to assume, seeing the second query).
That accounts for the left joins to subqueries.
These subqueries are generated because of FirstOrDefault. In SQL it always requires some sort of subquery to get "first" records of one-to-many relationships. This ROW_NUMBER() OVER construction is actually quite efficient. Your second query doesn't have any notion of "first" records. It'll probably return different data.
Yes (maybe) because you also Include data. I'm not sure why. Some people seem to think Include is necessary to make subsequent projections (.Select) work, but it isn't. If that's your reason to use Includes then you can remove them and thus remove the first couple of joins.
OTOH you also Include system.Project which is not in the projection, so you seem to have added the Includes deliberately. And in this case they have effect, because the entire entity system is in the projection, otherwise EF would ignore them.
If you need the Includes then again, EF has to generate outer joins for the reason mentioned above.
EF decides to handle the Includes and projections separately, while hand-crafted SQL, aided by prior knowledge of the data could do that more efficiently. There's no way to affect that behavior though.
This LINQ query is close to your SQL, but I'm afraid of correctness of the result:
var resultSet =
(from system in dbContext.Systems
from templateText in system.Template.TemplateTexts
where templateText.Language == locale.LanguageIdentifier
from typeText in system.Template.Type.TypeTexts
where typeText.Language == locale.LanguageIdentifier
select new
{
System = system,
TemplateText = templateText
TypeText = typeText
})
.FirstOrDefault(x => x.System.Id == request.Id);

EF - Linq Expression and using a List of Ints to get best performance

So I have a list(table) of about 100k items and I want to retrieve all values that match a given list.
I have something like this.
the Table Sections key is NOT a primary key, so I'm expecting each value in listOfKeys to return a few rows.
List<int> listOfKeys = new List<int>(){1,3,44};
var allSections = Sections.Where(s => listOfKeys.Contains(s.id));
I don't know if it makes a difference but generally listOfKeys will only have between 1 to 3 items.
I'm using the Entity Framework.
So my question is, is this the best / fastest way to include a list in a linq expression?
I'm assuming that it isn't better to use another .NETICollection data object. Should I be using a Union or something?
Thanks
Suppose the listOfKeys will contain only small about of items and it's local list (not from database), like <50, then it's OK. The query generated will be basically WHERE id in (...) or WHERE id = ... OR id = ... ... and that's OK for database engine to handle it.
A Join would probably be more efficient:
var allSections =
from s in Sections
join k in listOfKeys on s.id equals k
select s;
Or, if you prefer the extension method syntax:
var allSections = Sections.Join(listOfKeys, s => s.id, k => k, (s, k) => s);

Wait for DomainContext.Load<t> from an entityquery with joins to complete (returning new type via 'select new')

My app consolidates data from other DBs for reporting purposes. We can't link the databases, so all the data processing has to be done in code - this is fine as we want to allow manual validation during the imports.
Certain users will be able to start an update through the Silverlight 4 front end.
I have 3 tables in database x that are fed from one EF4 Model (ModelX). I want to join those tables together, select specific columns and return the result as a new entity that exists in a different EF4 Model (ModelY). I'm using this query:
var myQuery = from i in DBx.table1 from it in DBx.table2 from h in DBx.table3 where (i.id==it.id && h.otherid == i.otherid) select new ModelYServer {Name = i.name,Thing = it.thing, Stuff = h.stuff};
The bit i'm stuck on, is how to execute that query, and wait until the Asynchronous call has completed. Normally, i'd use:
DomainContext.Load<T>(myQuery).Completed += (sender,args) =>
{List<T> myList = ((LoadOperation<T>)sender.Entities.ToList();};
but I can't pass myQuery (an IEnumerable) into the DomainContext.Load() as that expects an EntityQuery. The dataset is very large, and is taking up to 30 seconds to return, so I definitely need to wait before continuing.
So can anyone tell me how I can wait for the IEnumerable query to complete, or suggest a better way of doing this (there very likely is one).
Thanks
Mick
One simple way is just to force it to evaluate by calling ToList:
var query = from i in DBx.table1
join it in DBx.table2 on i.id equals it.id
join h in DBx.table3 on i.otherid equals h.otherid
select new ModelYServer {
Name = i.name,
Thing = it.thing,
Stuff = h.stuff
};
// This will block until the results have been fetched
var results = query.ToList();
// Now use results...
(I've changed your where clause into joins on the earlier tables, as that's what you were effectively doing and this is more idiomatic, IMO.)

LINQ subquery question

Can anybody tell me how I would get the records in the first statement that are not in the second statement (see below)?
from or in TblOrganisations
where or.OrgType == 2
select or.PkOrgID
Second query:
from o in TblOrganisations
join m in LuMetricSites
on o.PkOrgID equals m.FkSiteID
orderby m.SiteOrder
select o.PkOrgID
If you only need the IDs then Except should do the trick:
var inFirstButNotInSecond = first.Except(second);
Note that Except treats the two sequences as sets. This means that any duplicate elements in first won't be included in the results. I suspect that this won't be a problem since the name PkOrgID suggests a unique ID of some kind.
(See the documentation for Enumerable.Except and Queryable.Except for more info.)
Do you need the whole records, or just the IDs? The IDs are easy...
var ids = firstQuery.Except(secondQuery);
EDIT: Okay, if you can't do that, you'll need something like:
var secondQuery = ...; // As you've already got it
var query = from or in TblOrganisations
where or.OrgType == 2
where !secondQuery.Contains(or.PkOrgID)
select ...;
Check the SQL it produces, but I think it should do the right thing. Note that there's no point in performing any ordering in the second query - or even the join against TblOrganisations. In other words, you could use:
var query = from or in TblOrganisations
where or.OrgType == 2
where !LuMetricSites.Select(m => m.FkSiteID).Contains(or.PkOrgID)
select ...;
Use Except:
var filtered = first.Except(second);

Combining 2 Linq queries into 1

Given the following information, how can I combine these 2 linq queries into 1. Having a bit of trouble with the join statement.
'projectDetails' is just a list of ProjectDetails
ProjectDetails (1 to many) PCardAuthorizations
ProjectDetails (1 to many) ExpenditureDetails
Notice I am grouping by the same information and selecting the same type of information
var pCardAccount = from c in PCardAuthorizations
where projectDetails.Contains(c.ProjectDetail)
&& c.RequestStatusId == 2
group c by new { c.ProjectDetail, c.ProgramFund } into g
select new { Key = g.Key, Sum = g.Sum(x => x.Amount) };
var expenditures = from d in ExpenditureDetails
where projectDetails.Contains(d.ProjectDetails)
&& d.Expenditures.ExpenditureTypeEnum == 0
group d by new { d.ProjectDetails, d.ProgramFunds } into g
select new {
Key = g.Key,
Sum = g.Sum(y => y.ExpenditureAmounts.FirstOrDefault(a => a.IsCurrent && !a.RequiresAudit).CommittedMonthlyRecords.ProjectedEac)
};
I don't think Mike's suggestion will work here. These two queries are sufficiently different that combining them into one query will just make it more difficult to read.
The objects have different types.
The where clauses are different.
The group by clause is different:
new { c.ProjectDetail, c.ProgramFund }
new { c.ProjectDetails, c.ProgramFunds }
The select clauses are different.
In fact there isn't really anything that is the same. I'd recommend leaving it as it is.
You could always just concat them together before evaluating them.
pCardAccount.Concat(expenditures).ToArray()
should generate a single sql statement with a union. As to whether there is any benefit to this in your situation, I don't know. There's also a chance that linq-to-sql won't be able to generate the SQL for the concat and will throw an exception whenever you use it. I'm not sure what causes it, but I've seen it a few times when I was playing around in a similar situation.
Edit: I just noticed that your keys are different in each of the queries. I'm not sure if this is a typo or not, but if it isn't, you won't be able to concat them due to the different types and it wouldn't make any sense to anyway

Resources