LINQ with Group, Join and Where Easy in SQL not in LINQ? - performance

I have trouble understanding how to translate SQL into LINQ. I would like to do the following, but can't figure out how to get the group by to work:
var query = from s in Supplier
join o in Offers on s.Supp_ID equals o.Supp_ID
join p in Product on o.Prod_ID equals p.Prod_ID
where s.City == "Chicago"
group s by s.City into Results
select new { Name = Results.Name };
I just need to do something simple, like display the product names for this query. How does group by work together with joins and a where clause?

You haven't provided your classes, so I assumed they look like this:
public class Supplier
{
public int SupplierID { get; set; }
public string SuppierName { get; set; }
public string City { get; set; }
}
public class Product
{
public int ProductID { get; set; }
public string ProductName { get; set; }
}
public class Offer
{
public int SupplierID { get; set; }
public int ProductID { get; set; }
}
Then I added data for testing:
List<Supplier> supplierList = new List<Supplier>()
{
new Supplier() { SupplierID = 1, SuppierName = "FirstCompany", City = "Chicago"},
new Supplier() { SupplierID = 2, SuppierName = "SecondCompany", City = "Chicago"},
new Supplier() { SupplierID = 3, SuppierName = "ThirdCompany", City = "Chicago"},
};
List<Product> productList = new List<Product>()
{
new Product() { ProductID = 1, ProductName = "FirstProduct" },
new Product() { ProductID = 2, ProductName = "SecondProduct" },
new Product() { ProductID = 3, ProductName = "ThirdProduct" }
};
List<Offer> offerList = new List<Offer>()
{
new Offer() { SupplierID = 1, ProductID = 2},
new Offer() { SupplierID = 2, ProductID = 1},
new Offer() { SupplierID = 2, ProductID = 3}
};
If you want to show the names of the suppliers whose products have been offered, then your LINQ query should look like this:
IEnumerable<string> result = from supplier in supplierList
join offer in offerList on supplier.SupplierID equals offer.SupplierID
join product in productList on offer.ProductID equals product.ProductID
where supplier.City == "Chicago"
group supplier by supplier.SuppierName into g
select g.Key;
You can check whether the correct names have been selected:
foreach (string supplierName in result)
{
Console.WriteLine(supplierName);
}
It should give the following result:
FirstCompany
SecondCompany
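If the goal is the product names rather than the supplier names, one option is to make the product name the grouped element and the supplier name the key. The following is only a minimal sketch, assuming the Supplier, Product and Offer classes and the test data guessed in this answer; the names ProductsBySupplierDemo, Describe and RunSample are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same assumed shapes as in the answer above (including the SuppierName spelling).
public class Supplier
{
    public int SupplierID { get; set; }
    public string SuppierName { get; set; }
    public string City { get; set; }
}

public class Product
{
    public int ProductID { get; set; }
    public string ProductName { get; set; }
}

public class Offer
{
    public int SupplierID { get; set; }
    public int ProductID { get; set; }
}

public static class ProductsBySupplierDemo
{
    // Returns one "supplier: product, product" line per Chicago supplier that has offers.
    public static List<string> Describe(
        IEnumerable<Supplier> suppliers, IEnumerable<Offer> offers, IEnumerable<Product> products)
    {
        var query = from supplier in suppliers
                    join offer in offers on supplier.SupplierID equals offer.SupplierID
                    join product in products on offer.ProductID equals product.ProductID
                    where supplier.City == "Chicago"
                    // Product names are the grouped elements, the supplier name is the key.
                    group product.ProductName by supplier.SuppierName into g
                    select g.Key + ": " + string.Join(", ", g);
        return query.ToList();
    }

    // Runs the query over the same test data as the answer above.
    public static List<string> RunSample()
    {
        var suppliers = new List<Supplier>
        {
            new Supplier { SupplierID = 1, SuppierName = "FirstCompany", City = "Chicago" },
            new Supplier { SupplierID = 2, SuppierName = "SecondCompany", City = "Chicago" },
            new Supplier { SupplierID = 3, SuppierName = "ThirdCompany", City = "Chicago" }
        };
        var products = new List<Product>
        {
            new Product { ProductID = 1, ProductName = "FirstProduct" },
            new Product { ProductID = 2, ProductName = "SecondProduct" },
            new Product { ProductID = 3, ProductName = "ThirdProduct" }
        };
        var offers = new List<Offer>
        {
            new Offer { SupplierID = 1, ProductID = 2 },
            new Offer { SupplierID = 2, ProductID = 1 },
            new Offer { SupplierID = 2, ProductID = 3 }
        };
        return Describe(suppliers, offers, products);
    }

    public static void Main()
    {
        foreach (string line in RunSample())
            Console.WriteLine(line);
        // Prints:
        // FirstCompany: SecondProduct
        // SecondCompany: FirstProduct, ThirdProduct
    }
}
```

ThirdCompany does not appear because the inner joins drop suppliers without offers.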

You could try this:
var query = from s in Supplier
join o in Offers on s.Supp_ID equals o.Supp_ID
join p in Product on o.Prod_ID equals p.Prod_ID
where s.City == "Chicago"
group s
by new {s.City, s.Name} //added this
into Results
select new { Name = Results.Key.Name };

You group s (Supplier) by s.City. The result is a sequence of IGrouping<string, Supplier> (assuming City is a string, as the comparison with "Chicago" suggests). That is, only the city key and the suppliers are within reach after the grouping: for each city you get an IEnumerable<Supplier> of its suppliers (which will be multiplied by the joins, by the way).
Since you also have the condition where s.City == "Chicago", grouping by city is of no use: there is only one city. So I think you may as well do something like this:
from s in Supplier
join o in Offers on s.Supp_ID equals o.Supp_ID
join p in Product on o.Prod_ID equals p.Prod_ID
where s.City == "Chicago"
select new {
City = s.City,
Supplier = s.Name,
Product = p.Name,
...
};
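To make the point about grouping by city concrete, here is a small in-memory sketch. It assumes tuple-shaped stand-ins for the supplier and offer rows (the data and the names GroupingShapeDemo and Inspect are hypothetical) and reports what the single Chicago group contains.

```csharp
using System;
using System.Linq;

public static class GroupingShapeDemo
{
    // Returns the group key, the number of supplier rows in the group,
    // and the number of distinct supplier names among those rows.
    public static (string Key, int Rows, int DistinctSuppliers) Inspect()
    {
        // Hypothetical sample rows; every supplier is in Chicago.
        var suppliers = new[]
        {
            (Id: 1, Name: "FirstCompany", City: "Chicago"),
            (Id: 2, Name: "SecondCompany", City: "Chicago"),
            (Id: 3, Name: "ThirdCompany", City: "Chicago")
        };
        var offers = new[]
        {
            (SupplierId: 1, ProductId: 2),
            (SupplierId: 2, ProductId: 1),
            (SupplierId: 2, ProductId: 3)
        };

        // Each group exposes only its Key (the city) and the supplier rows.
        var groups = from s in suppliers
                     join o in offers on s.Id equals o.SupplierId
                     group s by s.City;

        var chicago = groups.Single();
        return (chicago.Key,
                chicago.Count(),
                chicago.Select(row => row.Name).Distinct().Count());
    }

    public static void Main()
    {
        var info = Inspect();
        Console.WriteLine(info.Key + ": " + info.Rows + " rows, "
            + info.DistinctSuppliers + " distinct suppliers");
        // Prints: Chicago: 3 rows, 2 distinct suppliers
    }
}
```

SecondCompany appears twice in the group because it has two offers, which is the multiplication by the joins mentioned above.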

Related

How do I use LINQ group by clause to return unique employee rows?

I'm pretty new to LINQ, and I can't for the life of me figure this out. I've seen lots of posts on how to use the group by in LINQ, but for some reason, I can't get it to work. This is so easy in ADO.NET, but I'm trying to use LINQ. Here's what I have that is relevant to the problem. I have marked the part that doesn't work.
public class JoinResult
{
public int LocationID;
public int EmployeeID;
public string LastName;
public string FirstName;
public string Position;
public bool Active;
}
private IQueryable<JoinResult> JoinResultIQueryable;
public IList<JoinResult> JoinResultIList;
JoinResultIQueryable = (
from e in IDDSContext.Employee
join p in IDDSContext.Position on e.PositionID equals p.PositionID
join el in IDDSContext.EmployeeLocation on e.EmployeeID equals el.EmployeeID
where e.PositionID != 1 // Do not display the super administrator's data.
orderby e.LastName, e.FirstName
// ***** Edit: I forgot to add this line of code, which applies a filter
// ***** to the IQueryable. It is this filter (or others like it that I
// ***** have omitted) that causes the query to return multiple rows.
// ***** The EmployeeLocationsList contains multiple LocationIDs, hence
// ***** the duplicate employees that I need to get rid of.
JoinResultIQueryable = JoinResultIQueryable
.Where(e => EmployeeLocationsList.Contains(e.LocationID));
// *****
// ***** The following line of code is what I want to do, but it doesn't work.
// ***** I just want the above join to bring back unique employees with all the data.
// ***** Select Distinct is way too cumbersome, so I'm using group by.
group el by e.EmployeeID
select new JoinResult
{
LocationID = el.LocationID,
EmployeeID = e.EmployeeID,
LastName = e.LastName,
FirstName = e.FirstName,
Position = p.Position1,
Active = e.Active
})
.AsNoTracking();
JoinResultIList = await JoinResultIQueryable
.ToListAsync();
How do I get from the IQueryable to the IList only returning the unique employee rows?
***** Edit:
Here is my current output:
[4][4][Anderson (OH)][Amanda][Dentist][True]
[5][4][Anderson (OH)][Amanda][Dentist][True]
[4][25][Stevens (OH)][Sally][Dental Assistant][True]
[4][30][Becon (OH)][Brenda][Administrative Assistant][False]
[5][30][Becon (OH)][Brenda][Administrative Assistant][False]
You do not actually need grouping here; you need Distinct. Ordering before Distinct or before grouping is useless, and AsNoTracking is not needed with a custom projection.
var query =
from e in IDDSContext.Employee
join p in IDDSContext.Position on e.PositionID equals p.PositionID
join el in IDDSContext.EmployeeLocation on e.EmployeeID equals el.EmployeeID
where e.PositionID != 1 // Do not display the super administrator's data.
select new JoinResult
{
LocationID = el.LocationID,
EmployeeID = e.EmployeeID,
LastName = e.LastName,
FirstName = e.FirstName,
Position = p.Position1,
Active = e.Active
};
query = query.Distinct().OrderBy(e => e.LastName).ThenBy(e => e.FirstName);
JoinResultIList = await query.ToListAsync();
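One caveat worth checking: Distinct compares every projected column, so as long as LocationID is part of the projection, rows for the same employee at different locations still count as different. The sketch below works on in-memory tuples shaped like the question's output (not on the EF query itself; the names UniqueEmployeesDemo, CountAfterDistinct and OnePerEmployee are made up) and contrasts Distinct with grouping by employee.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UniqueEmployeesDemo
{
    // Rows shaped like the question's output: (LocationID, EmployeeID, LastName).
    public static readonly (int LocationID, int EmployeeID, string LastName)[] Rows =
    {
        (4, 4, "Anderson (OH)"),
        (5, 4, "Anderson (OH)"),
        (4, 25, "Stevens (OH)"),
        (4, 30, "Becon (OH)"),
        (5, 30, "Becon (OH)")
    };

    // Distinct keeps all five rows because the LocationID values differ.
    public static int CountAfterDistinct()
    {
        return Rows.Distinct().Count();
    }

    // Grouping by employee yields one row per employee and keeps every location.
    public static List<(int EmployeeID, string LastName, List<int> LocationIDs)> OnePerEmployee()
    {
        return Rows
            .GroupBy(r => new { r.EmployeeID, r.LastName })
            .Select(g => (EmployeeID: g.Key.EmployeeID,
                          LastName: g.Key.LastName,
                          LocationIDs: g.Select(r => r.LocationID).ToList()))
            .OrderBy(e => e.LastName)
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(CountAfterDistinct()); // Prints: 5
        foreach (var e in OnePerEmployee())
            Console.WriteLine(e.EmployeeID + " " + e.LastName + " ["
                + string.Join(", ", e.LocationIDs) + "]");
    }
}
```

Leaving LocationID out of the projection is the other way to make Distinct collapse the duplicates, at the cost of losing the location information.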
The problem is that a few employees have more than one location, which causes the results to be repeated. You can handle this in multiple ways. I'm using a let clause to tackle the issue in the example below:
public class Employee
{
public string FirstName { get; set; }
public string LastName { get; set; }
public int EmployeeID { get; set; }
public int PositionID { get; set; }
}
public class EmployeeLocation
{
public int EmployeeID { get; set; }
public int LocationID { get; set; }
}
public class Position
{
public int PositionID { get; set; }
public string Position1 { get; set; }
}
public class Location
{
public int LocationID { get; set; }
}
public class JoinResult
{
// Suggestion: instead of LocationID there should be a variable that holds all the locations of an employee
public IEnumerable<int> LocationIDs;
public int LocationID;
public int EmployeeID;
public string LastName;
public string FirstName;
public string Position;
public bool Active;
}
//Setting up mock data
List<Position> positions = new List<Position>();
positions.Add(new Position() { Position1 = "Dentist", PositionID = 2 });
positions.Add(new Position() { Position1 = "Dental Assistant", PositionID = 3 });
positions.Add(new Position() { Position1 = "Administrative Assistant", PositionID = 4 });
List<Employee> employees = new List<Employee>();
employees.Add(new Employee() { EmployeeID = 4, FirstName = "Amanda", LastName = "Anderson (OH)", PositionID = 2 });
employees.Add(new Employee() { EmployeeID = 25, FirstName = "Sally", LastName = "Stevens (OH)", PositionID = 3 });
employees.Add(new Employee() { EmployeeID = 30, FirstName = "Brenda", LastName = "Becon (OH)", PositionID = 4 });
List<Location> locations = new List<Location>();
locations.Add(new Location() { LocationID = 4 });
locations.Add(new Location() { LocationID = 5 });
List<EmployeeLocation> employeeLocation = new List<EmployeeLocation>();
employeeLocation.Add(new EmployeeLocation() { LocationID = 4, EmployeeID = 4 });
employeeLocation.Add(new EmployeeLocation() { LocationID = 5, EmployeeID = 4 });
employeeLocation.Add(new EmployeeLocation() { LocationID = 4, EmployeeID = 25 });
employeeLocation.Add(new EmployeeLocation() { LocationID = 4, EmployeeID = 30 });
employeeLocation.Add(new EmployeeLocation() { LocationID = 5, EmployeeID = 30 });
var result = (from e in employees
join p in positions on e.PositionID equals p.PositionID
let employeeLocations = (from el in employeeLocation where el.EmployeeID == e.EmployeeID select new { LocationID = el.LocationID })
where e.PositionID != 1 // Do not display the super administrator's data.
orderby e.LastName, e.FirstName
select new JoinResult
{
LocationID = employeeLocations.Select(l => l.LocationID).First(), // Here it just selects the first location
LocationIDs = employeeLocations.Select(l => l.LocationID), // This is my suggestion
EmployeeID = e.EmployeeID,
LastName = e.LastName,
FirstName = e.FirstName,
Position = p.Position1,
}).ToList();
Okay. So here is the solution I came up with. I installed the morelinq NuGet package, which contains a DistinctBy() method. Then I added that method to the last line of the code shown in my problem.
JoinResultIList = JoinResultIQueryable
.DistinctBy(jr => jr.EmployeeID)
.ToList();
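For readers who would rather not add a package, the behaviour relied on here (keep the first row seen for each key) is small enough to sketch by hand. This is an illustrative in-memory helper, not MoreLINQ's implementation; the names DistinctByKey and DistinctByKeyDemo are hypothetical, and the rows mirror the question's output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctByKeyExtensions
{
    // Hypothetical helper: yields the first element seen for each key, in source order.
    public static IEnumerable<T> DistinctByKey<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var seen = new HashSet<TKey>();
        foreach (T item in source)
        {
            // HashSet.Add returns false when the key has already been seen.
            if (seen.Add(keySelector(item)))
                yield return item;
        }
    }
}

public static class DistinctByKeyDemo
{
    public static List<string> Run()
    {
        // (LocationID, EmployeeID, LastName) rows as in the question's output.
        var rows = new[]
        {
            (LocationID: 4, EmployeeID: 4, LastName: "Anderson (OH)"),
            (LocationID: 5, EmployeeID: 4, LastName: "Anderson (OH)"),
            (LocationID: 4, EmployeeID: 25, LastName: "Stevens (OH)"),
            (LocationID: 4, EmployeeID: 30, LastName: "Becon (OH)"),
            (LocationID: 5, EmployeeID: 30, LastName: "Becon (OH)")
        };

        return rows.DistinctByKey(r => r.EmployeeID)
                   .Select(r => r.LocationID + ":" + r.LastName)
                   .ToList();
    }

    public static void Main()
    {
        foreach (string line in Run())
            Console.WriteLine(line);
        // Prints:
        // 4:Anderson (OH)
        // 4:Stevens (OH)
        // 4:Becon (OH)
    }
}
```

Like this helper, an IEnumerable-based DistinctBy runs in memory after the rows have been fetched, so the duplicates are still read from the database. .NET 6 and later also ship Enumerable.DistinctBy in System.Linq.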

Joining multiple tables to return a list using LINQ

I have a LINQ statement I am trying to figure out - I am relatively new to this so please excuse my ignorance. I want to return a person list, each person having a list of interests.
The person(p) table joins to the personinterest(pi) table by p.id = pi.personid
The personinterest table joins to the interest(i) table, pi.interestid to i.id.
public class Persons
{
public int Id { get; set; }
public string LastName { get; set; }
public string FirstName { get; set; }
...
public IList<Interest> PersonInterests { get; set; }
}
public class Interest
{
public string InterestName { get; set; }
}
The class I am returning is a person, each with its PersonInterests list populated with 0 to many interests. The Linq statement I have below is returning data, but each person record is only getting one interest, and persons with more than one interest are having their person data duplicated as shown below the linq statement
var interests = _db.Interests;
return (from p in _db.People
join i in _db.PersonInterests on p.Id equals i.PersonId
join s in _db.Interests on i.InterestId equals s.Id
select new Persons{
Id = p.Id,
FirstName = p.FirstName,
LastName = p.LastName,
Age = p.Age,
Address = p.Address,
City = p.City,
StateAbbrev = p.StateAbbrev,
ZipCode = p.ZipCode,
PersonInterests = (from r in interests where r.Id == i.InterestId select r).ToList()
}).ToList();
Results:
{"id":1,"lastName":"Alexander","firstName":"Carson","age":23,"address":"123 4th Street","city":"Jamestown","stateAbbrev":"NV","zipCode":"65465","personInterests":[{"id":1,"interestName":"Basketball"}],"photo":null}
{"id":1,"lastName":"Alexander","firstName":"Carson","age":23,"address":"123 4th Street","city":"Jamestown","stateAbbrev":"NV","zipCode":"65465","personInterests":[{"id":2,"interestName":"Camping"}],"photo":null},
Instead of this, I would like the data to look like this:
{"id":1,"lastName":"Alexander","firstName":"Carson","age":23,"address":"123 4th Street","city":"Jamestown","stateAbbrev":"NV","zipCode":"65465","personInterests":[{"id":1,"interestName":"Basketball"}, {"id":2,"interestName":"Camping"}],"photo":null}
I have struggled with this for a while, any help is greatly appreciated.
So, after staring at this for hours, I realized it was stupid to try to do this in one query. I split it up and got it working: first I retrieve the person data, then for each person I build their list of interests.
public async Task<IEnumerable<PersonResource>> GetPeople()
{
IEnumerable<PersonResource> people = await (from p in _db.People
select new PersonResource
{
Id = p.Id,
FirstName = p.FirstName,
LastName = p.LastName,
Age = p.Age,
Address = p.Address,
City = p.City,
StateAbbrev = p.StateAbbrev,
ZipCode = p.ZipCode,
Photo = p.Photo,
Interests = new List<string>()
}).ToListAsync();
foreach (PersonResource person in people)
{
person.Interests = (from iint in _db.Interests
join n in _db.PersonInterests on iint.Id equals n.InterestId
where n.PersonId == person.Id
select iint.InterestName).ToList();
}
return people;
// return Mapper.Map<List<Person>, List<PersonResource>>(people);
}
Do it like this:
(from p in _db.People
select new {
Id = p.Id,
FirstName = p.FirstName,
LastName = p.LastName,
Age = p.Age,
Address = p.Address,
City = p.City,
StateAbbrev = p.StateAbbrev,
ZipCode = p.ZipCode,
Photo = p.Photo,
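Interests = (from pi in _db.PersonInterests
join i in _db.Interests on pi.InterestId equals i.Id
where pi.PersonId == p.Id // Interests has no PersonId; go through the join table
select i.InterestName).ToList()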
}).ToList();
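If the rows are already in memory, or two or three round trips are acceptable, another way to avoid a query per person is to build a lookup from person id to interest names once and then read from it. This is a minimal LINQ to Objects sketch with tuple-shaped stand-ins for the three tables; the second person and the names PeopleWithInterestsDemo and Build are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PeopleWithInterestsDemo
{
    public static List<(int Id, string FirstName, List<string> Interests)> Build()
    {
        // Stand-ins for the People, Interests and PersonInterests tables.
        var people = new[]
        {
            (Id: 1, FirstName: "Carson"),
            (Id: 2, FirstName: "Meredith") // hypothetical person with no interests
        };
        var interests = new[]
        {
            (Id: 1, InterestName: "Basketball"),
            (Id: 2, InterestName: "Camping")
        };
        var personInterests = new[]
        {
            (PersonId: 1, InterestId: 1),
            (PersonId: 1, InterestId: 2)
        };

        // One pass over the join table: person id -> interest names.
        var interestsByPerson =
            (from pi in personInterests
             join i in interests on pi.InterestId equals i.Id
             select new { pi.PersonId, i.InterestName })
            .ToLookup(x => x.PersonId, x => x.InterestName);

        // A lookup returns an empty sequence for a missing key, so people
        // without interests get an empty list instead of being dropped.
        return people
            .Select(p => (Id: p.Id,
                          FirstName: p.FirstName,
                          Interests: interestsByPerson[p.Id].ToList()))
            .ToList();
    }

    public static void Main()
    {
        foreach (var person in Build())
            Console.WriteLine(person.FirstName + ": " + string.Join(", ", person.Interests));
        // Prints:
        // Carson: Basketball, Camping
        // Meredith:
    }
}
```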

Get tuple and list linked as result

I have this query :
var query = (from tables ...
where ...
select new
{
ClientName = ClientName,
ClientNumber = ClientNumber,
ClientProduct = ClientProduct
}).Distinct();
which returns rows with 3 values.
ClientName and ClientNumber can be linked to multiple products.
So we can have :
NameA NumberA Product1
NameA NumberA Product2
NameA NumberA Product3
NameB NumberB Product4
NameC NumberC Product5
I would like to know if it is possible to store that in a List of a certain class which would be like :
class MyClass
{
string ClientName;
int ClientNumber;
List<int> ClientProducts;
}
So there are no duplicate of ClientName and ClientNumber.
Thank you in advance.
With this class structure to represent your data:
class MyClass
{
public string ClientName { get; set; }
public int ClientNumber { get; set; }
public List<int> ClientProducts { get; set; }
}
class Product
{
public string ClientName { get; set; }
public int ClientNumber { get; set; }
public int ProductID { get; set; }
}
and this test data:
List<Product> Products = new List<Product>()
{
new Product() { ClientName = "A", ClientNumber = 1, ProductID = 1},
new Product() { ClientName = "A", ClientNumber = 1, ProductID = 2},
new Product() { ClientName = "A", ClientNumber = 1, ProductID = 3},
new Product() { ClientName = "B", ClientNumber = 2, ProductID = 4},
new Product() { ClientName = "C", ClientNumber = 2, ProductID = 5}
};
you can use the following linq query:
var q = from p in Products
group p by new
{
cName = p.ClientName,
cNumber = p.ClientNumber
} into pGroup
select new MyClass
{
ClientName = pGroup.Key.cName,
ClientNumber = pGroup.Key.cNumber,
ClientProducts = pGroup.Select(x => x.ProductID).ToList()
};
to get exactly what you want, i.e. a collection of MyClass objects.
The grouping performed in the above LINQ query guarantees that there will be no duplicates of the (ClientName, ClientNumber) pair.
Since you mention LINQ to SQL, your Client entity most probably already has the products linked, so you may be looking for an overcomplicated solution.
It depends a bit on your foreign key structure, but if your data model is
"a Client has one-to-many Products" and you have a foreign key from Product to Client, the link is already present.
So you can just reference client.Products.
In your case it would be:
var query = (from Clients...
where ...
select new
{
ClientName = Client.ClientName,
ClientNumber = Client.ClientNumber,
ClientProduct = Client.Products.Select(s=>s.id).ToList()
});
But you might as well simply use your Client entity with an eager load of the products.
It all depends on your data model and a proper foreign key structure.
If you have a many-to-many association, such as a product-per-client table between your clients and products, you can start from that entity. Have a look at this documentation; it provides a good starting point for LINQ to SQL.
http://weblogs.asp.net/scottgu/using-linq-to-sql-part-1
I solved the same problem, and I think this will be useful to you.
Just check your where condition carefully.
var query = (from tables ...
where ...
select new
{
ClientName = ClientName,
ClientNumber = ClientNumber,
ClientProduct = ClientProduct.ToList()
}).Distinct();

Linq to objects, joining to collections and simplify select new

I need some help simplifying a LINQ query. I have two classes, Invoice and Customer.
The Invoice has a property CustomerId and a property Customer.
I need to get all invoices and include the Customer object.
I don't like my query, as it needs to change if new properties are added to the Invoice object.
I can't join the invoice and customer earlier than this stage so that is not an alternative.
My query.
var customers = GetCustomers();
var invoices = GetInvoices();
var joinedList = (from x in invoices
join y in customers on x.CustomerId equals y.CustomerId
select new Invoice
{
Amount = x.Amount,
CustomerId = x.CustomerId,
Customer = y,
InvoiceId = x.InvoiceId
}).ToList();
The classes
public class Invoice
{
public int InvoiceId { get; set; }
public double Amount { get; set; }
public int CustomerId { get; set; }
public Customer Customer { get; set; }
}
public class Customer
{
public int CustomerId { get; set; }
public string Name { get; set; }
}
private static IEnumerable<Invoice> GetInvoices()
{
yield return new Invoice
{
Amount = 34,
CustomerId = 1,
InvoiceId = 1
};
yield return new Invoice
{
Amount = 44.7,
CustomerId = 1,
InvoiceId = 2
};
yield return new Invoice
{
Amount = 67,
CustomerId = 2,
InvoiceId = 3
};
yield return new Invoice
{
Amount = 89,
CustomerId = 3,
InvoiceId = 4
};
}
private static IEnumerable<Customer> GetCustomers()
{
yield return new Customer
{
CustomerId = 1,
Name = "Bob"
};
yield return new Customer
{
CustomerId = 2,
Name = "Don"
};
yield return new Customer
{
CustomerId = 3,
Name = "Alice"
};
}
Why not just a simple foreach loop:
// Dictionary for efficient look-up
var customers = GetCustomers().ToDictionary(c => c.CustomerId);
var invoices = GetInvoices().ToList();
//TODO: error checking
foreach(var i in invoices)
i.Customer = customers[i.CustomerId];
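The error checking left as a TODO could look roughly like the following. It is only a sketch, assuming the Invoice and Customer classes from the question; the names InvoiceLinker and AttachCustomers are made up, and collecting unmatched invoices (rather than throwing) is just one possible policy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int CustomerId { get; set; }
    public string Name { get; set; }
}

public class Invoice
{
    public int InvoiceId { get; set; }
    public double Amount { get; set; }
    public int CustomerId { get; set; }
    public Customer Customer { get; set; }
}

public static class InvoiceLinker
{
    // Sets Customer on each invoice in place and returns the invoices
    // whose CustomerId has no matching customer.
    public static List<Invoice> AttachCustomers(
        IReadOnlyCollection<Invoice> invoices, IEnumerable<Customer> customers)
    {
        // Dictionary for efficient look-up, as in the loop above.
        Dictionary<int, Customer> byId = customers.ToDictionary(c => c.CustomerId);
        var unmatched = new List<Invoice>();

        foreach (Invoice invoice in invoices)
        {
            Customer customer;
            if (byId.TryGetValue(invoice.CustomerId, out customer))
                invoice.Customer = customer;
            else
                unmatched.Add(invoice); // the indexer would throw KeyNotFoundException here
        }
        return unmatched;
    }

    public static void Main()
    {
        var customers = new List<Customer> { new Customer { CustomerId = 1, Name = "Bob" } };
        var invoices = new List<Invoice>
        {
            new Invoice { InvoiceId = 1, Amount = 34, CustomerId = 1 },
            new Invoice { InvoiceId = 2, Amount = 67, CustomerId = 99 } // no such customer
        };

        List<Invoice> unmatched = AttachCustomers(invoices, customers);
        Console.WriteLine(invoices[0].Customer.Name); // Prints: Bob
        Console.WriteLine(unmatched.Count);           // Prints: 1
    }
}
```

The invoices parameter is a materialized collection on purpose: GetInvoices() uses yield return, so enumerating it again would create fresh Invoice objects without the Customer that was just assigned, which is why the loop above calls ToList() first.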

How do I aggregate joins?

I have three related tables:
Employee(EmployeeId, EmployeeName)
Skill(SkillId, SkillName)
EmployeeSkill(EmployeeSkillId, EmployeeId, SkillId)
EmployeeSkillId is an identity column.
The rows in the database are the following:
Employee Table:
EmployeeId | EmployeeNumber | EmployeeName
---------- | -------------- | ------------
1 | 10015 | John Doe
Skill Table:
SkillId | SkillName
------- | ---------
1 | .NET
2 | SQL
3 | OOD
4 | Leadership
EmployeeSkill Table:
EmployeeSkillId | EmployeeId | SkillId
--------------- | ---------- | -------
1 | 1 | 1
2 | 1 | 2
3 | 1 | 3
4 | 1 | 4
If I have an employee who has three skills registered in EmployeeSkill, I'd like to get a result like the following:
John Doe, "Skill-1, Skill-2, Skill-3"
That is, to concatenate the name of the skills for that employee into a single string.
I tried the following, but it isn't working.
var query = from emp in Employee.All()
from es in emp.EmployeeSkills
join sk in Skill.All() on es.SkillId equals sk.SkillId
group sk by new {emp.EmployeeName} into g
select new TestEntity
{
Name = g.Key.EmployeeName,
Skills = g.Aggregate(new StringBuilder(),
(sb, grp_row) => sb.Append(grp_row.SkillName))
.ToString()
};
The aggregated list of skill names is coming back empty. How can I do this?
It sounds like you could do the join as part of the select:
var query = from emp in Employee.All()
select new TestEntity {
Name = emp.EmployeeName,
Skills = string.Join(", ",
(from es in emp.EmployeeSkills
join sk in Skill.All() on es.SkillId equals sk.SkillId
select sk.SkillName)) };
Now that's going to do the join separately for each employee, which isn't terribly efficient. Another option is to build a mapping from skill ID to skill name first:
var skillMap = Skill.All().ToDictionary(sk => sk.SkillId,
sk => sk.SkillName);
then the main query is easy:
var query = from emp in Employee.All()
select new TestEntity {
Name = emp.EmployeeName,
Skills = string.Join(", ",
emp.EmployeeSkills.Select(sk => skillMap[sk.SkillId]))};
Ultimately there are lots of ways of skinning this cat - for example, if you wanted to stick to your original approach, that's still feasible. I would do it like this:
var query = from emp in Employee.All()
from es in emp.EmployeeSkills
join sk in Skill.All() on es.SkillId equals sk.SkillId
group sk.SkillName by emp into g
select new TestEntity
{
Name = g.Key.EmployeeName,
Skills = string.Join(", ", g)
};
At this point it's quite similar to your original query, just using string.Join instead of Aggregate, of course. If all these three approaches come back with an empty skills list, then I suspect there's something wrong with your data. It's not obvious to me why your first query would "succeed" but with an empty skill list.
EDIT: Okay, here's a short(-ish) but complete example of it working:
using System;
using System.Collections.Generic;
using System.Linq;
public class Employee
{
public int EmployeeId { get; set; }
public string EmployeeName { get; set; }
public static List<Employee> All { get; set; }
public IEnumerable<EmployeeSkill> EmployeeSkills
{
get
{
return EmployeeSkill.All
.Where(x => x.EmployeeId == EmployeeId);
}
}
}
public class Skill
{
public string SkillName { get; set; }
public int SkillId { get; set; }
public static List<Skill> All { get; set; }
}
public class EmployeeSkill
{
public int SkillId { get; set; }
public int EmployeeId { get; set; }
public static List<EmployeeSkill> All { get; set; }
}
class Test
{
static void Main()
{
Skill.All = new List<Skill>
{
new Skill { SkillName = "C#", SkillId = 1},
new Skill { SkillName = "Java", SkillId = 2},
new Skill { SkillName = "C++", SkillId = 3},
};
Employee.All = new List<Employee>
{
new Employee { EmployeeName = "Fred", EmployeeId = 1 },
new Employee { EmployeeName = "Ginger", EmployeeId = 2 },
};
EmployeeSkill.All = new List<EmployeeSkill>
{
new EmployeeSkill { SkillId = 1, EmployeeId = 1 },
new EmployeeSkill { SkillId = 2, EmployeeId = 1 },
new EmployeeSkill { SkillId = 2, EmployeeId = 2 },
new EmployeeSkill { SkillId = 3, EmployeeId = 2 },
};
var query = from emp in Employee.All
from es in emp.EmployeeSkills
join sk in Skill.All on es.SkillId equals sk.SkillId
group sk.SkillName by emp.EmployeeName into g
select new
{
Name = g.Key,
Skills = string.Join(", ", g)
};
foreach (var result in query)
{
Console.WriteLine(result);
}
}
}
Results:
{ Name = Fred, Skills = C#, Java }
{ Name = Ginger, Skills = Java, C++ }
